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revised the book and published its fourth edition in 
Russian in 1985. 
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by Chicago University Press without the author's per- 
mission (the USSR joined the Universal Copyright 
Convention only in 1973). The second English pub- 
lication was by Mir Publishers, in 1975 and 1985 
(translations of the second and third Russian edi- 
tions). 

The fourth edition of the book was translated only 
into German ( 199 1, Deutscher Verlag de Wissenschaf- 
ten). 



abstract 



The Monte Carlo method is a numerical method 
of solving mathematical problems by random sam- 
pling. As a universal numerical technique, the Monte 
Carlo method could only have emerged with the ap- 
pearance of computers. The field of application of the 
method is expanding with each new computer gener- 
ation. 

This book contains the main schemes of the Monte 
Carlo method and various examples of how the meth- 
od can be used in queuing theory, quality and reli- 
ability estimations, neutron transport, astrophysics, 
and numerical analysis. 

The principal goal of the book is to show resear- 
chers, engineers, and designers in various areas (sci- 
ence, technology, industry, medicine, economics, agri- 
culture, trade, etc.) that they may encounter prob- 
lems in their respective fields that can be solved by 
the Monte Carlo method. 

The reader is assumed to have only a basic know- 
ledge of elementary calculus. Section 2 presents the 
concept of random variables in a simple way, which 
is quite enough for understanding the simplest pro- 
cedures and applications of the Monte Carlo method. 

The fourth revised and enlarged Russian edition 
(1985; German trans. 1991) can be used as a uni- 
versity textbook for students-nonmathematicians. 



preface 



The principal goal of this book is to suggest to 
specialists in various areas that there are problems 
in their fields that can be solved by the Monte Carlo 
method. 

Many years ago I agreed to deliver two lectures on 
the Monte Carlo method, at the Department of Com- 
puter Technology of the Public University in Moscow. 
Shortly before the first lecture, I discovered, to my 
horror, that most of the audience was unfamiliar with 
probability theory. It was too late to retreat: more 
than two hundred listeners were eagerly waiting. Ac- 
cordingly, I hurriedly inserted in the lecture a sup- 
plementary part that surveyed the basic concepts of 
probability. This book's discussion of random vari- 
ables in Chapter 1 is an outgrowth of that part, and 
I feel that I must say a few words about it. 

Everyone has heard, and most have even used, 
the words "probability" and "random variable." The 
intuitive idea of probability (considered as frequency) 
more or less corresponds to the true meaning of the 
term. But the layman's notion of a random variable 
is rather different from the mathematical definition. 
Therefore, the concept of probability is assumed to 
be understood, and only the more complicated con- 
cept of the random variable is clarified in the first 



chapter. This explanation cannot replace a course in 
probability theory: the presentation here is simpli- 
fied, and no proofs are given. But it does give the 
reader enough acquaintance with random variables 
for an understanding of Monte Carlo techniques. 

The problems considered in Chapter 2 are fairly 
simple and have been selected from diverse fields. 
Of course, they cannot encompass all the areas in 
which the method can be applied. For example, not 
a word in this book is devoted to medicine, although 
the method enables us to calculate radiation doses 
in X-ray therapy (see Computation of Neutron Trans- 
mission Through a Plate in Chapter 2). If we have a 
program for computing the absorption of radiation in 
various body tissues, we can select the dosage and 
direction of irradiation that most efficiently ensures 
that no harm is done to healthy tissues. 

The Russian version of this book is popular, and is 
often used as a textbook for students-nonmathemati- 
cians. To provide greater mathematical depth, the 
fourth Russian edition includes a new Chapter 3 that 
is more advanced than the material presented in the 
preceding editions (which assumed that the reader 
had only basic knowledge of elementary calculus). 
The present edition also contains additional informa- 
tion on different techniques for modeling random vari- 
ables, an approach to quasi-Monte Carlo methods, 
and a modern program for generating pseudorandom 
numbers on personal computers. 

Finally, I am grateful to Dr. E. Gelbard (Argonne 
National Laboratory) for encouragement in the writ- 
ing. 

I. SoboP 
Moscow, 1993 



introduction 



general idea of the method 

The Monte Carlo method is a numerical method of 
solving mathematical problems by the simulation of 
random variables. 

The Origin of the Monte Carlo Method 

The generally accepted birth date of the Monte 
Carlo method is 1949, when an article entitled The 
Monte Carlo method" by Metropolis and Ulam 1 ap- 
peared. The American mathematicians John von Neu- 
mann and Stanislav Ulam are considered its main 
originators. In the Soviet Union, the first papers on 
the Monte Carlo method were published in 1955 and 
1956 by V. V. Chavchanidze, Yu. A. Shreider and 
V. S. Vladimirov. 

Curiously enough, the theoretical foundation of 
the method had been known long before the von 
Neumann-Ulam article was published. Furthermore, 
well before 1949 certain problems in statistics were 
sometimes solved by means of random sampling — 
that is, in fact, by the Monte Carlo method. However, 



because simulation of random variables by hand is 
a laborious process, use of the Monte Carlo method 
as a universal numerical technique became practical 
only with the advent of computers. 

As for the name "Monte Carlo," it is derived from 
that city in the Principality of Monaco famous for its 
. . . casinos. The point is that one of the simplest 
mechanical devices for generating random numbers 
is the roulette wheel. We will discuss it in Chap- 
ter 2 under Generating Random Variables on a Com- 
puter. But it appears worthwhile to answer here one 
frequently asked question: "Does the Monte Carlo 
method help one win at roulette?" The answer is No; 
it is not even an attempt to do so. 

Example: the "Hit-or-Miss" Method 

We begin with a simple example. Suppose that 
we need to compute the area of a plane figure S. This 
may be a completely arbitrary figure with a curvilin- 
ear boundary; it may be defined graphically or analyt- 
ically, and be either connected or consisting of several 
parts. Let S be the region drawn in Figure 1, and let 
us assume that it is contained completely within a 
unit square. 

Choose at random N points in the square and des- 
ignate the number of points that happen to fall inside 
S by N'. It is geometrically obvious that the area of S 
is approximately equal to the ratio N'/N. The greater 
the N, the greater the accuracy of this estimate. 

The number of points selected in Figure 1 is N = 
40. Of these, N' = 12 points appeared inside S. The 
ratio N'/N = 12/40 = 0.30, while the true area of S is 
0.35. 

In practice, the Monte Carlo method is not used for 
calculating the area of a plane figure. There are other 
methods (quadrature formulas) for this, that, though 
they are more complicated, provide much greater ac- 
curacy. 




Fig. 1. N random points in the square. Of 
these, N' points are inside S. The area of S is 
approximately N'/N. 



However, the hit-or-miss method shown in our ex- 
ample permits us to estimate, just as simply, the 
"multidimensional volume" of a body in a multidimen- 
sional space; in such a case the Monte Carlo method 
is often the only numerical method useful in solving 
the problem. 

Two Distinctive Features 
of the Monte Carlo Method 

One advantageous feature of the Monte Carlo 
method is the simple structure of the computation 
algorithm. As a rule, a program is written to carry 
out one random trial (in our previous "hit-or-miss" 
example one has to check whether a selected ran- 




Fig. 2. N random hits in the square. Of these, 
N' hits inside S. Is the area approximately 

N'/N? 



come acquainted with the definition of random vari- 
ables and with some of their properties. This infor- 
mation is presented in the first part of Chapter 1 
under Random Variables. A reader who is familiar 
with probability theory may omit this section, except 
for the discussion entitled The General Scheme of the 
Monte Carlo Method. 

The procedures by which the random points in 
Figures 1 and 2 are actually computed are revealed 
at the end of Chapter 1 (see Again About the Hit-or- 
Miss Examples). 
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random variables 

We assume that the reader is acquainted with the 
concept of probability, and we turn directly to the 
concept of a random variable. 

The words "random variable," in ordinary lay us- 
age, connote that one does not know what value a 
particular variable will assume. However, for math- 
ematicians the term "random variable" has a precise 
meaning: though we do not know this variable's value 
in any given case, we do know the values it can as- 
sume and the probabilities of these values. The result 
of a single trial associated with this random variable 
cannot be precisely predicted from these data, but we 
can predict very reliably the result of a great number 
of trials. The more trials there are (the larger the 
sample) , the more accurate our prediction. 

Thus, to define a random variable, we must indi- 
cate the values it can assume and the probabilities 
of these values. 



Discrete Random Variables 

A random variable £ is called discrete if it can as- 
sume any of a set of discrete values xi, x 2 , . . ■ , x n . A 
discrete random variable is therefore defined by a ta- 
ble 

' X\ X2 ... X n 



(X) 
Pi Pi •■• PnJ 

where xi, x 2 , . . . , x n are the possible values of £, and 
Pi,p 2 ,...,p„ are the corresponding probabilities. To 
be precise, the probability that the random variable £ 
will be equal to x f (denoted by P{£ = x t }) is equal to 
Pi- 

Ptf = *.■}= W 
Table (T) is called the distribution of the random vari- 
able f . 

The values xi, x 2 , . . . , x n can be arbitrary.* How- 
ever, the probabilities p 1( p 2 , . . . , p n must satisfy two 
conditions: 

1. All pi are positive: 

Pi > (1.1) 



2. The sum of all the pi equals 1: 

Pl+P2 + -..+Pn=l (1-2) 



The latter condition requires that in each trial, £ must 
necessarily assume one of the listed values. 
The number 

n 

M£ = Y, X iPi (!- 3 ) 

«=1 

is called the mathematical expectation, or the expected 
value, of the random variable £. 



•In probability theory discrete random variables that can 
assume an infinite sequence of values are also considered. 
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To elucidate the physical meaning of this value, 
we rewrite it in the following form: 



M£ = ^ 



Eft 

t=i 

From this relation, we see that M£ is the average 
value of the variable f , in which more probable val- 
ues are included with larger weights. (Averaging with 
weights is of course very common in science. In me- 
chanics, for example, if masses mi, m 2 , . . . , m„ are lo- 
cated at the points an, X2, . ■ ■ , x n on the x axis, then 
the center of gravity of this system is given by the 
equation 

n 

J2 X i m i 
» = 1 

x = — 

t = l 

Of course, in this case the sum of all the masses does 
not necessarily equal one.) 

Let us mention the basic properties of the expected 
value. If c is an arbitrary nonrandom number, then 

M(£ + c) = M£ + c (1.4) 

and 

M(c£) = cMC (1.5) 

If £ and 77 are two arbitrary random variables, then 
M(£ + 17) = M£ + M17 (1.6) 

The number 

DC = M(K-M£] 2 ) (1.7) 

is called the variance of the random variable £. Thus, 
the variance is the expected value of the squared de- 
viation of the random variable £ from its average value 
M£. Obviously, D£ is always greater than zero. 
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The expected value and the variance are the most 
important numerical characteristics of the random 
variable £. What is their practical value? 

If we observe the variable f many times and ob- 
tain the values £ x , £ 2 , • • • , £,n (each of which is equal to 
one of the numbers xi, x 2 , ■ ■ ■ , x n ), then the arithmetic 
mean of these values is close to M£: 

^(&+6 + .-- + &v)«M£ (1.8) 

and the variance D£ characterizes the spread of these 
values around the mean value M£. 

Equation 1 .8 is a simple case of the famous law 
of large numbers and can be explained by the follow- 
ing considerations. Assume that among the obtained 
values £i, &i • • • , 6v the number xi occurs k x times, 

the number x 2 occurs k 2 times and the number 

x„ occurs k n times. Then 

N 
y^,$i = X l^l + x 2 k 2 + • • • + x n k n 

J'=l 



Hence, 

j=l 

At large N, the frequency ki/N of the value x, ap- 
proaches its probability p t so that ki/N « p,-. There- 
fore, 

1 N n j n 

Equation 1.7 for the variance can be transformed 
using Equations 1.4, 1.5, and 1.6: 

D£ = M(£ 2 - 2£ • M£ + [M£] 2 ) 

= M(£ 2 )-2M£-M£ + [M£] 2 
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from which it follows that 

D£ = M(£ 2 )-(M£) 2 (1.9) 

Usually the computation of variance by Equation 1.9 
is simpler than by Equation 1.7. 

The variance has the following basic properties. If 
c is an arbitrary nonrandom number, then 

D(£ + c) = D£ (1.10) 

and 

D(cO = c 2 D£ (1.11) 

The concept of independence of random vari- 
ables plays an important role in probability theory. 
Independence is a rather complicated concept, though 
it may be quite clear in the simplest cases. Let us 
suppose that we are simultaneously observing two 
random variables £ and rj. If the distribution of £ 
does not change when we know the value that rj as- 
sumes, then it is natural to consider f independent 

Of J]. 

The following relations hold for independent ran- 
dom variables £ and 77: 

M(£r?) = Mf ■ M77 (1.12) 

and 

D(£ + 77) = D£ + Dr? (1.13) 

Example: Throwing a Die 

Let us consider a random variable x with distri- 
bution specified by the table 

12 3 4 5 6 
1/6 1/6 1/6 1/6 1/6 1/6 

Clearly, >c can assume the values 1, 2, 3, 4, 5, 6, and 
each of these values is equally probable. So, the num- 
ber of pips appearing when a die is thrown can be 
used to calculate k. 
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According to Equation 1 .3 

6 6 6 

and according to Equation 1.9 

Dx = M(x 2 ) - (Mx) 2 = l 2 • J + 2 2 • i + 
+ 6 2 - i - (3.5) 2 = 2.917 



Example: Tossing a Coin 

Let us consider a random variable with distribu- 
tion 

3 4 

1/2 1/2, 

The game of tossing a coin, with the agreement that 
a head counts three points and a tail counts four 
points, can be used to generate 9. Here 

M0 = 3-^+4-i = 3.5 



and 



D0 = 3 2 • \- + 4 2 • 1 - (3.5) 2 = 0.25 



We see that M0 = Mx, but D0 < Dx. This could 
easily have been predicted, since the maximum devi- 
ation of 9 from 3.5 is ±0.5, while for the values of x, 
the spread can reach ±2.5. 

Continuous Random Variables 

Let us assume that some radium is placed on a 
Cartesian plane at the origin. As an atom of ra- 
dium decays, an a-particle is emitted. Its direction 
is described by the angle ip (Figure 1.1). Since, both 
in theory and practice, any direction of emission is 
possible, this random variable can assume any value 
from to 27r. 

We shall call a random variable continuous if it can 
assume any value in a certain interval (a, b). 
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o x 

Fig. 1.1. Random direction. 



y=p(x) 




a a' V b x 
Fig. 1.2. Probability density. 

A continuous random variable £ is denned by spec- 
ifying an interval containing all its possible values, 
and a function p(x) that is called the probability den- 
sity of the random variable £ (or the distribution den- 
sity of £). 

The physical meaning of p(x) is as follows: let 
(a', V) be an arbitrary interval contained in (a, 6) (that 
is, a < a', b' <b). Then the probability that £ falls in 
the interval (a 1 , b') is equal to the integral 

i' 
P{a' < £ < b'} = I p(x) dx (1.14) 

a' 

This integral (1.14) is equal to the shaded area in 
Figure 1.2. 
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The set of values of £ can be any interval. The 
cases a — -oo and/or b = oo are also possible. How- 
ever, the density p(x) must satisfy two conditions 
analogous to conditions (1.1) and (1.2) for discrete 
variables: 

1. The density p(x) is positive inside (a, 6): 

p(x) > (1.15) 



2. The integral of the density p(x) over the whole 

interval (a, b) is equal to 1: 





/ 



p(x)dx=l (1-16) 



The number 





/ xp{x) 



M£= xp(x) dx (1-17) 

a 

is called the expected value of the continuous random 
variable £. 

The expected value has the same meaning as in 
the case of a discrete random variable. Indeed, since 

b 

f xp{x) dx 

MZ=°- 

f p(x) dx 

a 

it can easily be seen that this is the average value of 
£: any value of x from the interval (a, b) enters the 
integral with its weight p(x) dx. 

(In this case we also have an analogous equation 
in mechanics: if the linear density of a rod a < x < b 
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is equal to p(x), then the abscissa of the center of 
gravity is given by 

b 

J xp(x) dx 



b 

J p{x) dx 



b 

Of course, the total mass of the rod / p(x) dx does not 

a 

necessarily equal one.) 

All the Equations 1.4 through 1.13 are valid also 
for continuous random variables. This includes the 
definition of variance 1.7, the Equation 1.9, and all 
the properties of M£ and D£.* 

We will mention just one more formula for the ex- 
pectation of a function of £. Again, let the random 
variable £ have probability density p(x). Consider an 
arbitrary continuous function f(x) and define a ran- 
dom variable rj = /(£). It can be proved that 

6 

M/(0 = J f{x) P {x)dx (1.18) 

a 

(Of course, a similar equation is valid for a discrete 
random variable £ with distribution (T): M/(£) = 

n 

E f(xi)pi-) 

It must be stressed that, in general, M/(f ) ^ /(M£). 
Finally, for a continuous random variable £ and an 
arbitrary value x 

P{£ = x} = 

Therefore, the probability of an equality {£ = x} is 
physically meaningless. Physically meaningful are 



* However, in probability theory more general random 
variables are encountered: where the condition (1.15) is 
weakened to p(x) > o, where the expected value M£ does 
not exist, and where the variance D£ is infinite. 
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Fig. 1.3. Constant density 
(uniform distribution). 

probabilities of falling into a small interval: 

P{x < £ < x + dx} — p(x) dx 

An Important Example 

A random variable 7 defined over the interval < 
x < 1 and having a density p(x) = 1 is said to be 
uniformly distributed over (0, 1) (Figure 1.3). For any 
subinterval (a', 6') within (0, 1), the probability that 
7 lies in (a', 6') is equal to 





I 



p{x) dx = b' — a' 



that is, the length of the subinterval. In particular, if 
we divide (0, 1) into any number of intervals of equal 
length, the probabilities of 7 falling into any of these 
intervals are the same. 

It is easy to calculate that 

1 1 

M^Jx P (x)dx = Jxdx=l/2 
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and 



D7 = / x 2 p(x) dx - (M7) 2 = 1/3 - 1/4 = 1/12 



This random variable 7 will be used frequently below. 

Normal Random Variables 

A normal (or Gaussian) random variable is a ran- 
dom variable £ defined on the whole axis -00 < x < 00 
and having the density 

* ) = dK""(- fe s£) (1 - 19 > 

where a and <r > are real parameters. (The character 
V" in this equation represents a number and not a 
random variable; the use of a Greek letter here is tra- 
ditional. Equation 1.19 can be found on the German 
ten-mark banknote beside the portrait of C. F. Gauss.) 
The parameter a does not affect the shape of the 
curve p(x): varying a results only in displacement of 
the curve along the x axis. However, a change in a 
does change the shape of the curve. Indeed, it is easy 
to see that 

maxp(i) = p(a) = 



r\/27r 

If a decreases, the maxp(i) increases. But, according 
to condition (1.16), the entire area under the curve 
p(x) is equal to 1. Therefore, the curve extends up- 
ward near x = a, but decreases for all sufficiently large 
values of x. Two normal densities corresponding to 
a = 0, with a = 1 and a = 0, with cr = 0.5 are drawn in 
Figure 1.4. (Another normal density can be found in 
Figure 2.6 below.) 

It can be proved that 

M< = a, and DC = <r 2 
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X-3 -2 



Fig. 1.4. Two Gaussian (or normal) 
densities. 

Arbitrary probabilities P{x' < ( < x"} can be easily 
computed using one table containing values of the 
function 



*M = ^/-'*' ! 



dt 



that is usually called the error function. This term is 
sometimes used for other functions (e.g., erf a;), but all 
such functions can easily be transformed into $(x). 
To show this, in accordance with (1.14), we write 



'<*'<«*"> = ^/-(-^) 



dx 



In the integral we make the substitution x — a = at. 
This produces 



12 

P{x'<<;<x"}=-±=jexp(-t 2 /2) 



dt 



where ti = (x' - a)/ a and t 2 = (x" - a)/ a. Hence, 
V{x'<(<x"}= l -{*{t 2 )-${t 1 )) 

Tables of $(x) contain only positive values of x since 

$(— x) = -$(x). 
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Normal random variables are encountered in a wide 
variety of problems. (The reason for this will be ex- 
plained later in this chapter in a discussion of the 
central limit theorem.) Here we shall consider two 
different approaches to estimating the deviations of £ 
from a = MC- 

the rule of "three sigmas" 

Assume that x' = a - 3<r, x" = a + 3<r. Then ti = -3, 
t 2 = 3, and from the last formula it follows that 

P{a-3cr<C<a + 3(T} = $(3) = 0.997 (1.20) 

The probability 0.997 is so near to 1 that often 
Equation 1.20 is given the following interpretation: 
for a single trial it is practically impossible to obtain 
a value of ( differing from MC by more than 3<r. 

the probable error 

Consider now x' = a-r, x" = a + r where the quan- 
tity r is defined as 

r = 0.6745(7 

Then t x = -0.6745, t 2 = 0.6745 and 

P{a-r<C<a + r}~ $(0.6745) = 0.500 

This relation can be rewritten in the form 

P{|C-a|<r} = 0.5 

Hence, the probability of the opposite inequality is 
also 0.5: 

P{|C-a|>r} = 0.5 

(of course, P{|< - a\ = r} = since ( is a continuous 
random variable). 

The last two relations show that values of ( deviat- 
ing from a by more than r and values deviating from 
a by less than r are equally probable. Therefore, r is 
called the probable error of C,. 
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Example: Error of Measurement 

The error i in an experimental measurement is 
usually a normal random variable. If there is no sys- 
tematic error (bias) then a — M6 = 0. But the quantity 
a = %/D6, called the standard deviation of 8, is always 
positive, and characterizes the error of the method of 
measurement (the precision of the method). 

For a single measurement the absolute error \6\ 
as a rule does not exceed 3<r. The probable error 
r = 0.6745cr shows the order of magnitude of |6| that 
can be both smaller or larger than r. 

P{|^| < r} = P{|6| > r} = 0.5 

The Central Limit Theorem 
of Probability Theory 

This remarkable theorem was first formulated by 
P. S. Laplace. Many outstanding mathematicians, in- 
cluding the Russians P. L. Chebyshev, A. A. Markov, 
A. M. Lyapunov, and A. Ya. Khinchin, have worked 
on various generalizations of the original theorem. All 
the proofs are rather complex. 

Let us consider N identical independent random 
variables ^ , £ 2 , • • • , 6v , so that their probability distri- 
butions coincide. Consequently, their mathematical 
expectations and variances also coincide (we assume 
that they are finite). The random variables can be 
continuous or discrete. 

Let us designate 

M£i = M£ 2 = . . . = M£y = rn 

D6 = D6 = = D6v = b 2 
Denote the sum of all these variables by p N : 

pn - 6 + 6 + • • ■ + 6v 
From Equations 1.6 and 1.13 it follows that 

M PN = M(£i + & + . . . +$ N ) = Nm 
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and 

T> PN = D(6 +6 + • • • + 6r) = Nb 2 

Now let us consider a normal random variable ( N 

with the same parameters: a = Nm, a = b\fN. Its 

density is denoted p N (x). The central limit theorem 

states that for any interval (V, x") and for all large N 

x" 

P{x' < pw < x"} « / pn(x) dx 

x' 

The physical meaning of this theorem is clear: the 
sum p N of a large number of identical independent 
random variables is approximately normal. Actually, 
this theorem is valid under much weaker conditions: 
the variables £1,62, • • • , 6v should not necessarily be 
identical and independent; essentially, all that is re- 
quired is that individual variables fy do not play too 
great a role in the sum. 

It is this theorem that explains why normal ran- 
dom variables are so often encountered in nature. 
Indeed, whenever we meet an aggregate effect of a 
large number of small random factors, the resulting 
random variable is normal. For example, the scatter- 
ing of artillery shells is almost always normal, since 
it depends on weather conditions in all the various 
regions of the trajectory as well as on many other 
factors. 

The General Scheme 

of the Monte Carlo Method 

Suppose that we need to calculate some unknown 
quantity m. Let us try to find a random variable £ with 
M£ = m. Assume that the variance of £ is D£ = b 2 . 

Consider TV independent random variables £1, £2, 
...,£n with distributions identical to that of f . If TV 
is sufficiently large, then it follows from the central 
limit theorem that the distribution of the sum 

PN = 6 + ^2 + • • ■ + £n 
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will be approximately normal, with a = Nm and a = 
b\/~N. According to the rule of "three sigmas" (1.20) 

P{Nm - 3bVN <p N < Nm + 3by/N} « 0.997 

If we divide the inequality within the parentheses by 
N, we obtain an equivalent inequality, whose proba- 
bility remains the same: 



P{m - -= < ^f < m + -==} 



0.997 



We can rewrite the last expression in a somewhat 
different form: 



1 N 

N^ j 



— m 



< 



36 
y/N 



0.997 



(1.21) 



This is an extremely important relation for the Monte 
Carlo method, giving us both the method for calcu- 
lating m and the error estimate. 

Indeed, we have to find TV values of the random 
variable £ — selecting one value of each of the vari- 
ables £i, £ 2 , • • • , £/v is equivalent to selecting N values 
of £, since all these variables have identical distribu- 
tions. 

From (1.21) it is obvious that the arithmetic mean 
of these values will be approximately equal to m. In 
all likelihood, the error of this approximation does not 
exceed 36/\/^V, and approaches zero as N increases. 

In practical computations, the error bound 36/a//V 
is often loose, and it is more convenient to use the 
probable error 

r N = 0.67456/^ 

However, this is not a bound — this is a characteristic 
of the absolute error 

1 N 
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generating random 
variables on a computer 

Sometimes the problem statement of generating 
random variables on a computer provokes the ques- 
tion: "Everything the machine does must be pro- 
grammed beforehand, so where can randomness come 
from?" There are, indeed, certain difficulties associ- 
ated with this point, but they belong more to philos- 
ophy than to mathematics, and we will not consider 
them here. 

We will only stress that random variables are ideal 
mathematical concepts. Whether natural phenomena 
can actually be described by means of these variables 
can only be ascertained experimentally. Such a de- 
scription is always approximate. Moreover, a random 
variable that satisfactorily describes a physical quan- 
tity in one type of phenomenon may prove unsatis- 
factory when used to describe the same quantity in 
other phenomena. Analogously on a national map a 
road may be depicted as a straight line, whereas on 
a local city map the same road must be drawn as a 
twisted band. 

Usually, three means for obtaining random vari- 
ables are considered: tables of random numbers, ran- 
dom number generators, and the pseudorandom num- 
ber method. We will discuss each. 

Tables of Random Numbers 

Let us perform the following experiment. We mark 
the digits 0, 1, 2, . . . , 9 on ten identical slips of paper, 
place them in a hat, mix them, and take one out; 
then return it and mix again. We write down the 
digits obtained in this way in a table like Table 1.1 
(the digits in Table 1 . 1 are arranged in groups of five 
for convenience). 

Such a table is usually called a table of random 
numbers, though random digits would be a better 
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Table 1.1. 400 Random Digits 



86515 90795 66155 66434 56558 12332 94377 57802 

69186 03393 42502 99224 88955 53758 91641 18867 

41686 42163 85181 38967 33181 72664 53807 00607 

86522 47171 88059 89342 67248 09082 12311 90316 

72587 93000 89688 78416 27589 99528 14480 50961 

52452 42499 33346 83935 79130 90410 45420 77757 

76773 97526 27256 66447 25731 37525 16287 66181 

04825 82134 80317 75120 45904 75601 70492 10274 

87113 84778 45863 24520 19976 04925 07824 76044 

84754 57616 38132 64294 15218 49286 89571 42903 



term. This table can be put into a computer's mem- 
ory. Then, when performing a calculation, if we re- 
quire values of a random variable e with the distribu- 
tion 

£ ~(° l 2 ••• M (1.22) 

Vo.i o.i o.i ... o.iy K ' 

then we need only take the next digit from this table. 

The largest of all published tables of random num- 
bers contains one million random digits (see RAND 
Corporation). 2 Of course, it was compiled with the 
aid of technical equipment more sophisticated than 
a hat: a special electronic roulette wheel was con- 
structed. Figure 1.5 shows an elementary scheme of 
such a roulette wheel. 

It should be noted that a good table of random 
numbers is not as easy to compile as it may initially 
appear. Any real physical device produces random 
numbers with distributions that differ slightly from 
ideal distributions (1.22); in addition, experimental 
errors may occur (for example, a slip of paper might 
stick to the hat's lining). Therefore, compiled tables 
are carefully examined, using special statistical tests, 
to check whether any properties of the group of num- 
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Fig. 1.5. Roulette wheel for gener- 
ating random digits (scheme). 

bers contradict the hypothesis that these numbers 
are independent values of the random variable e. 

Let us examine the simplest, and, at the same 
time, most important tests. Consider a table con- 
taining N digits ei, e 2 , . . . , e N . Denote the number of 
zeros in this table by v , the number of ones by v\, 
the number of twos by z/ 2 . and so on. Consider the 
sum 

f ( N " 

Probability theory enables us to predict the range in 
which this sum should lie; its value should not be 
too large, since the expected value of each of the v t 
is equal to N/10, but neither should it be too small, 
since that would indicate a "too regular" distribution 
of the numbers. ("Too regularly" distributed numbers 
facilitate certain computations, known as quasi-Monte 
Carlo methods. But these numbers cannot be used 
as general purpose random numbers.) 

Assume now that the number N is even, N = 2N', 
and consider pairs (ei, e 2 ), (£3, e*),---, (ejv-i, £ n)- 
Denote by v^ the number of pairs equal to (i, j) and 
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calculate the sum 

i=0 j=0 x 

Again, probability theory predicts the range in which 
this sum should lie, and thus we can test the distri- 
bution of pairs. (Similarly, we may test the distribu- 
tion of triplets, quadruplets, etc.) 

However, tables of random numbers are used only 
for Monte Carlo calculations performed by hand. 
Computers cannot store such large tables in their 
small internal memories, and storing such tables in 
a computer's external memory considerably slows cal- 
culations. 

Generators of Random Numbers 

It would seem that a roulette wheel could be cou- 
pled to a computer, in order to generate random num- 
bers as needed. However, because any such mechan- 
ical device would be too slow for a computer, vacuum 
tube noise is usually proposed as a source of random- 
ness. For example, in Figure 1.6, the noise level E is 
monitored; if, within some fixed time interval At, the 
noise exceeds a given threshold E an even number 
of times, then a zero is recorded; if the noise exceeds 
E an odd number of times, a one is recorded. (More 
sophisticated devices also exist.) 

At first glance this appears to be a very convenient 
method. Let m such generators work in parallel, all 
the time, and send random zeros and ones into a 
particular address in RAM. At any moment the com- 
puter can refer to this cell and take from it a value of 
the random variable 7 distributed uniformly over the 
interval < x < 1. This value is, of course, approx- 
imate, being an m-digit binary fraction of the form 
0.aia 2 ...a m , where each a, simulates a random vari- 
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Fig. 1.6. Random noise for gen- 
erating random bits (scheme). 

able with the distribution 

1 
1/2 1/2 

But this method is not free of defects. First, it is diffi- 
cult to check the "quality" of the numbers produced. 
Tests must be carried out periodically, since any im- 
perfection can lead to a "distribution drift" (that is, 
zeros and ones in some places begin to appear with 
unequal frequencies). Second, it is desirable to be 
able to repeat a calculation on the computer, but im- 
possible to reproduce the same random numbers if 
they are not stored throughout the calculation; as 
discussed earlier, storing so much data is impracti- 
cal. 

Random number generators may prove useful if 
specialized computers are ever designed for solving 
problems by means of the Monte Carlo method. But 
it is simply not economical to install and maintain 
such a special unit in multipurpose computers, in 
which computations involving random numbers are 
performed only occasionally. It is therefore better to 
use pseudorandom numbers. 
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Pseudorandom Numbers 

Since the "quality" of random numbers used for 
computations is checked by special tests, one can ig- 
nore the means by which random numbers are pro- 
duced, as long as they satisfy the tests. We may even 
try to calculate random numbers by a prescribed — 
albeit sophisticated — formula. 

Numbers obtained by a formula that simulate the 
values of the random variable 7 are called pseudo- 
random numbers. The word "simulate" means that 
these numbers satisfy a set of tests just as if they 
were independent values of 7. 

The first algorithm for generating pseudorandom 
numbers, the mid-square method, was proposed by 
John von Neumann. We illustrate it with an example. 

Suppose we are given a four-digit number 70 = 
0.9876. We square it and obtain an eight-digit number 
jl = 0.97535376. Then we take out the middle four 
digits of this number and get 71 = 0.5353. 

Now we square ^ and obtain -)\ = 0.28654609. Once 
more we take out the middle four digits, and get j 2 = 
0.6546. 

Then we obtain T | = 0.42850116, 73 = 0.8501; 7! = 
0.72267001, 74 = 0.2670; 7! = 0.07128900, 75 = 0.1289; and 
so on. 

Unfortunately, this algorithm tends to produce a 
disproportionate frequency of small numbers; how- 
ever, other, better, algorithms have been discovered 
— these will be discussed in Chapter 3 under On 
Pseudorandom Numbers. 

The advantages of the pseudorandom numbers 
method are evident. First, obtaining each number 
requires only a few simple operations, so the speed 
of generating numbers is of the same order as the 
computer's work speed. Second, the program occu- 
pies only a few addresses in RAM. Third, any of the 
numbers y k can be reproduced easily. And finally, the 
"quality" of this sequence need be checked only once; 
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after that, it can be used many times in calculations 
of similar problems without taking any risk. 

The single shortcoming of the method is the lim- 
ited supply of pseudorandom numbers that it gives, 
since if the sequence of numbers 70 , y x , . . . , y kl ... is 
computed by an algorithm of the form 

7k = F(7fc_i) 

it must be periodic. (Indeed, in any address in RAM 
only a finite number of different numbers can be writ- 
ten. Thus, sooner or later one of the numbers, say 
7i, will coincide with one of the preceding numbers, 
say 7;. Then clearly, y L+1 = y l+1 , y L+2 = y, +2 , ..., so 
that there is a period P = L — l. The nonperiodic part 
of the sequence is Tl , y 2 , ...,il-i-) 

A large majority of computations performed by the 
Monte Carlo method use sequences of pseudorandom 
numbers whose periods exceed current requirements. 

A Remark 

One must be careful: there are papers, books, and 
software advertizing inadequate methods for gener- 
ating pseudorandom numbers, these numbers being 
checked only with the simplest tests. Attempts to 
use such numbers for solving more complex prob- 
lems lead to biased results (see On Pseudo Random 
Numbers in Chapter 3). 



transformations of random variables 

In the early stage of application of the Monte Carlo 
method, some users tried to construct a special 
roulette for each random variable. For example, a 
"roulette wheel" disk divided into unequal parts (pro- 
portional to p^, shown in Figure 1.7, can be used to 
generate values of a random variable with the distri- 
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Fig. 1.7. Roulette wheel for 
generating values of a discrete 
random variable. 



X\ X 2 X 3 £4 

0.5 0.25 0.125 0.125 



However, this proves unnecessary: values of any 
random variable can be obtained by transformations 
of values of one "standard" random variable. Usually 
this role is played by 7, which is uniformly distributed 
over the interval < x < 1. We already know how to 
get the values of 7; "the process of finding a value of 
some random variable £, accomplished by transform- 
ing one or more values of 7, we call the modeling of £. 



Modeling of a Discrete Random Variable 

Assume that we want to obtain values of a random 
variable £ with the distribution 



£ 



Xi 

Pi 



X2 

Pi 



Pn 



Consider the interval < y < 1, and break it up into 
n intervals with lengths equal to p lt p 2 , ..., p n . The 
coordinates of the points of division will obviously be 

y = pi. y - P1+P2, y - pi + P2 + Ps, ■■■, y = P1+P2 + 
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Fig. 1.8. Partition of the 
interval < y < 1 for 
modeling of a discrete 
random variable. 

. . . + p„_i. These intervals are numbered 1, 2, . . . , n 
(Figure 1.8), and this completes the preparation for 
modeling £. Each time we "perform an experiment" 
in order to find a value of £, we select a value of j and 
fix the point y = j. If this point falls into the interval 
numbered i, we consider that f = x t . 

It is easy to demonstrate the validity of this pro- 
cedure. Since j is uniformly distributed in the unit 
interval, the probability of 7 falling into any subinter- 
val is equal to the length of that subinterval. Hence, 

P{0<7<pi}=pi 

P{pi <7 <pi +p 2 } =P2 



P{pi + ...+P„-l <7< l}=Pn 
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According to our procedure, £ = z t if 

pi + . . . + p 8 _i < 7 < pi + . . . + pi 

and the probability of this event is p, . 

In practice, the definition of i can be carried out 
with the aid of comparisons: is 7 less than p x ? Oth- 
erwise, is 7 less than pi + p 2 ? Otherwise, is 7 less 
than pi + p 2 + P3? And so on. 

Note that the order of the values x x , x 2 , . ... , x n in 
the distribution of f can be arbitrary, but it must be 
fixed before the modeling. 

equiprobable values 

The previous modeling procedure is much simpler 
when all the probabilities p 8 are equal: p x = p 2 = . . . = 
p n = i/ n ; that is, when 

/ Xi x 2 ... x„ 
*~\l/n 1/n ... 1/n 

According to our procedure, £ = n if (i- l)/n < 7 < i/n 
or, in other words, i — 1 < 717 < i. The last relation 
means that the integral part of nj is equal to i - 1. 

The integral part of z is usually denoted [z\. So, 
we obtain a direct formula for modeling ^: 

£ = ar,- where i = [07] + 1 

numerical example 

Find 10 values of the random variable x consid- 
ered in the discussion of throwing a die, whose pos- 
sible values are x { = i, 1 < i < 6. As values of 7, we 
select groups of three random digits, from Table 1, 
multiplied by 0.001. So, 7 = 0.865; 0.159; 0.079; 0.566; 
0.155; 0.664; 0.345; 0.655; 0.812; 0.332. The correspond- 
ing values x = 1 + [67] are 6; 1; 1; 4; 1; 4; 3; 4; 5; 2. 
They may be regarded as ten throws of a die. 
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Modeling of a Continuous Random Variable 

Now let us assume that we want to obtain values 
of a random variable £ distributed over the interval 
a < x < b with density p(x). We shall prove that 
values of £ can be found from the equation 



/ 



p(x)dx = j (1.23) 



That is, selecting a consecutive value of 7, we must 
solve Equation 1.23 to find the corresponding value 
of£. 

For the proof let us consider the function 



X 

= J p( x ) 



dx 



From the general properties of density (1.15) and 
(1.16), it follows that 

y{a) = 0, y(b) = 1 
and the derivative 

y'(x) = p(x) > 

This means that the function y(x) increases mono- 
tonically from to 1 (Figure 1.9). Any straight line 
y = j, where < 7 < 1, intersects the curve y = y(x) 
in one, and only one, point, whose abscissa is taken 
for the value of £. Thus Equation 1.23 always has a 
unique solution. 

Now we select an arbitrary interval (a', V) con- 
tained in (a, b). The ordinates of the curve y — y{x) 
satisfying the inequality 

y{a') <y< y{b') 

correspond to the points of this interval 

a' < x < b' 
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a. | b x 

Fig. 1.9. The function 
y = Jp(x)dx for model- 

a 

ing of a continuous ran- 
dom variable. 




Fig. 1.10. One-to-one map- 
ping of the interval a' < x <b' 
onto y(a') <y< y(b'). 



Consequently, if £ belongs to a' < x < b', then j be- 
longs to y(a') <y < y(b'), and vice versa {Figure 1.10). 
Hence, 

P{a' < £ < b'} = P{y(a') < 7 < y(&')> 
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Since 7 is uniformly distributed over < y < 1, 



b' 



>{y(a') < 7 < y(b')} = y{b') - y(a') = f p(x) dx 



Therefore, 




P{a'<£<6'}= I p{x)dx 



and this means that the random variable £, which is 
a root of Equation 1.23, has the probability density 

p(x). 

Example 

The random variable 77 is said to be uniformly dis- 
tributed over the interval (a, 6) if its density is con- 
stant over this interval: 

p(x) = for all a < x < b 

b — a 

To model values of 77, we apply Equation 1.23: 

v 

/dx 
b—-a=" 

a 

The integral is easily computed: 

T) — a __ 
b-a _T 

Hence, we arrive at an explicit expression 

v - a + 7(6 - a) (1.24) 

that is a linear transformation of 7. 

Other examples of the application of Equation 1.23 
are given in Chapter 2 (see The Simple Flow of Re- 
quests and A Numerical Example). 
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von Neumann's Method for Modeling 
a Continuous Random Variable 

It may happen that Equation 1.23 is difficult to 
solve for£ (for example, when the integral of p(x) can- 
not be expressed in terms of elementary functions, or 
when the density p{x) is given graphically). 

Let us assume that the random variable £ is de- 
fined in a finite interval a < x < 6, and its density is 
bounded (Figure 1.11): 

p(x) < Mo 

The value of £ can be found in the following way: 

1. We select two random numbers 7' and j" and 
construct a random point T(r)', 77") with coordi- 
nates 

1/ = a + j'(b- a ), n" = y"M 

2. If the point T lies under the curve y = p(x), we 
set £ = t)'; if the point T lies above the curve, 
we reject the pair (7', 7") and select a new pair 

(7', !")■ 

The validity of this method can be demonstrated 
easily. First, since rf is uniformly distributed in (a, b), 
the probability that r will be in the strip (x, x + dx) 
is proportional to dx. Second, since 7/" is uniformly 
distributed in (0, M ), the probability that T will not 
be rejected is equal p(x)/M Q and so proportional to 
p(x). Hence, the probability that an accepted value 
£ = t}' belongs to the interval (x, x+dx) is proportional 
to p(x) dx. 

On Modeling Normal Variables 

There are alternative techniques for modeling var- 
ious random variables. These techniques are usually 
applied only when the methods for modeling discrete 
and continuous random variables described earlier 
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Fig. 1.11. von Neumann's method: £ = if. 

prove ineffective (see On Methods for Generating Ran- 
dom Variables in Chapter 3). 

In particular, these alternative techniques are ap- 
plied in the case of a normal random variable £, since 
the equation 



*=y«p(-£)<fe= 7 



a/2^ 



is not explicitly solvable. 

For computations by hand, one can use Table 1.2, 
in which values are given for a normal random vari- 
able £ with mathematical expectation M( = and 
variance D< = 1. It is not hard to verify that the ran- 
dom variable 

C = a + a<; (1.25) 

will also be normal; as follows from Equations 1.4, 
1.5, 1.10, and 1.11 

MC' = a, DC' = <r 2 

Thus, Equation 1.25 enables us to model arbitrary 
normal variables with the help of Table 1.2. 
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Table 1.2. 88 Normal Values with a = 0, a = 1 



0.2005 1.1922-0.0077 0.0348 1.0423-1.8149 1.1803 

0.0033 1.1609-0.6690-1.5893 0.5816 1.8818 0.7390 

-0.2736 1.0828 0.5864-0.9245 0.0904 1.5068-1.1147 

0.2776 0.1012-1.3566 0.1425-0.2863 1.2809 0.4043 

0.6379-0.4428-2.3006-0.6446 0.1516-1.7708 2.8854 

0.4686 1.4664 1.6852-0.9690-0.0831-0.5863 0.8574 

-0.5557 0.8115-0.2676-1.2496-1.2125 1.3846 1.1572 

0.9990-0.1032 0.5405-0.6022 0.0093 0.2119-1.4647 

-0.4428 -0.5564 -0.5098-1.1929 -0.0572 -0.5061 -0.1557 

-1.2384-0.3924 1.7981 0.6141-1.3596 1.4943-0.4406 

-0.2033-0.1316 0.8319 0.4270-0.8888 0.4167-0.8513 

1.1054 1.2237-0.7003 0.9780-0.7679 0.8960 0.5154 

-0.7165 0.8563-1.1630 1.8800 



Again About the Hit-or-Miss Example 

Now It is possible to explain how the random points 
in Figures 1 and 2 are generated. The points in Fig- 
ure 1 have coordinates 

x = j', y = y" 

The values of 7' and 7" are computed from groups of 
five digits from Table 1.1: Xl = 0.86515; yi = 0.90795; 
x 2 = 0.66155; y 2 = 0.66434; etc. 

Since the abscissas and the ordinates of these 
points are independent and uniformly distributed over 
< x < 1 and < y < 1, it can be proved that the prob- 
ability of one such point falling into any region inside 
the square is equal to the area of the region. In other 
words, these points are uniformly distributed over the 
square < a; < 1, < y < 1. 

The points in Figure 2 have coordinates 

a; = 0.5 + 0.2C', 2/ = 0.5 + 0.2<" 

where the values £' and £" are taken successively from 
Table 1.2: Xl = 0.5 + 0.2 • 0.2005; j/i = 0.5 + 0.2 • 1.1922; 
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x 2 = 0.5 + 0.2 • (-0.0077); y 2 = 0.5 + 0.2 ■ 0.0348; etc. (One 
of the points falls outside the square and is rejected.) 

As follows from (1.25), the abscissas and ordinates 
of these points are normal variables with means a = 
0.5 and variances a 2 = 0.04. 

Let us estimate the computational error of the hit- 
or-miss example. The random value N' is approxi- 
mately normal since N' is a sum 

where £,- = 1 if the j\h point falls inside S, and £,• = 
if otherwise. All these £j are independent and have a 
common distribution 

here the area of S is denoted by the same letter S. 
Thus, M£ = S, M£ 2 = S, D£ = S - S 2 . The variance of 
our estimate N'/N is equal to 

B(N'/N) = BZ/N = 5(1 - S)/N 

and its probable error is approximately 0.6745 x 
y/S(l - S)/N. At 5 = 0.35, N = 40 we obtain the value 
0.051 which is in close agreement with the actual error 
of the computation (0.05). 
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examples of the 

application of the 

monte carlo method 



simulation of a mass-servicing system 

Description of the Problem 

Consider a simple servicing system that has n 
"lines" (or "channels", or "distribution points") per- 
forming a set of operations that we will call "servic- 
ing". The system receives requests arriving at random 
moments: 7\ < T 2 < . . . < T k < 

Let T k be the moment of arrival of the kth request. 
If line 1 is free at t = T k , it starts servicing the re- 
quest; this takes t h minutes (</, is the holding time of 
the line). If line 1 is busy at t = T k , the request is 
immediately transferred to line 2. And so on. . . 

Finally, if all n lines are busy at instant T k , the 
system rejects the request. 

The problem is to determine how many requests 
(on average) the system satisfies during an interval of 



time T; how many rejections will be given? 

It is clear that problems of this type are encoun- 
tered in a variety of organizations ranging from those 
that provide simple services to those that require high- 
ly specialized and complex logistical coordination. In 
complex situations, like those we shall describe later, 
the Monte Carlo method turns out to be the only pos- 
sible method of computation; it is rare that an ana- 
lytical solution is found. 

The Simple Flow of Requests 

The first question that arises when such servicing 
systems are analyzed is: what is the mathematical 
model of the flow of incoming requests? This question 
is usually answered by experimental observations of 
similar systems over long periods of time. Investiga- 
tion of request flows under various conditions permits 
us to single out some frequently encountered models. 

A simple flow (or a Poisson flow) is a sequence of 
events in which the interval between any two consec- 
utive events is an independent random variable with 
density 

p(x) = ae~ ax , < x < oo (2.1) 

This density (2.1) is also called the exponential dis- 
tribution (see Figure 2.1, in which two densities for 
a = 1 and a = 2 are constructed). 

It is easy to compute the mathematical expectation 
of a random variable r with density (2.1): 

oo oo 

Mr = I xp(x)dx= xae~ ax dx 
o o 

Integration by parts (u = x, dv = ae~ ax dx) yields 



OCJ 

Mt= [-xe- ax ]™ + f e~ ax dx 



Jo 



The parameter a is called the request flow density. 
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1 2 X 

Fig. 2.1. Two exponential densities. 

The formula for modeling r is easily obtained from 
Equation 1.23, which in our case is written: 



T 

!> 



dx = 7 



Computing the integral on the left, we get the relation 



1 - e~ aT = 7 



from which, in turn, we get 



t = ln(l — 7) 

a 

However, the variable 1 - 7 has exactly the same 
distribution as 7, and so, instead of this last equation, 
we can use the equation 



--In 7 
a 



(2.2) 



The Computation Plan 

Let us consider the functioning of a system in the 
case of a simple flow of requests. To each of the n 
lines we assign one address in RAM, in which we 
register the moment when this line becomes free. We 
denote the moment at which the ith line becomes free 
by t { . 
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We begin the calculation at the time when the first 
request enters the system, so 71 = 0. At this point 
all the lines are free: t 1 = t 2 — ■ ■ ■ = t n = 0. The 
calculation will be finished at time Tf = Ti + T. 

The first request enters line 1; this line is now 
busy for the period t h , and we must replace t x by a 
new value (ti) new = T\+t h , add one to the counter 
of serviced requests, and turn to examine the second 
request. 

Let us now assume that k requests have already 
been considered. It is necessary to select the time of 
arrival of the (k + l)th request. For this we generate 
the next value of 7 and compute the next value r = r k 
using Equation 2.2. Then the entrance time 

Tk+i = Tk + Tk 

Is the first line free at this time? To find out, it is 
necessary to check the condition 

*i < T k +i (2.3) 

If this condition is met, it means that at time T k+ i the 
line is free and can service the request. We therefore 
replace t x by Tk+i + <h, add one to the counter, and 
turn to the next request. 

If condition (2.3) is not satisfied, it means that at 
moment Tk+i, the first line is busy. Then we must 
test whether the second line is free: 

h < T k+1 (2.4) 

If condition (2.4) is met, we replace t 2 by Tk+i + th, 
add one to the counter, and turn to the next request. 
If condition (2.4) is not satisfied, we proceed to test 
the condition 

t 3 < Tjfc+i 

It can happen that for all i from 1 to n, 

ti > Tk + 1 
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that is, all lines are busy at time T k +\. We then add 
one to the counter of rejections and turn to the next 
request. 

Each time Tt+i is computed, we have to check the 
condition for termination of the experiment: Tk+i > 
Tf. Once this condition is satisfied, the experiment 
comes to an end. The counters give us the number 
of satisfied requests fi sat and the number of rejected 
requests fi re j- 

This experiment must be repeated N times (each 
time with different values of 7). Then the results of 
all the trials are averaged: 

1 N 

M/Z s „( « — y y fisat, j 
J' = l 

and 

1 N 

j = l 

where the values fi sa t,j and Hrejj sre the values of 
H sat and n re j obtained in the j\h trial. 

More Complex Problems 

It is easy to see that we can use the same method 
to compute results for much more complex systems. 
For example, the holding time t h may be random and 
different for the various lines (this would correspond 
to the use of different servicing equipment or to the 
employment of service staff having differing qualifi- 
cations). The scheme of computations remains un- 
changed, but the values of t h must be modeled for 
each request, and the modeling formulas must be 
specific for each line. 

We can consider so-called systems with waiting, 
which do not overflow immediately; a request is stored 
for some time (its waiting time in the system) and is 
not rejected if any line becomes available during that 
time. 
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Systems can also be considered in which the next 
request is serviced by the first line that becomes free. 
We can take into account a random failure of each 
line and a random time interval necessary to repair 
it. It is possible to allow for variations in the density 
of the request flow over time, and in many additional 
factors. 

Of course, a price has to be paid for such simu- 
lations. Useful results are obtained only if a sound 
model is chosen. This requires careful study of ac- 
tual request flows, timings of the work at the various 
distribution points, and so forth. 

Generally, we must first ascertain the probabilistic 
laws governing the functioning of the various parts of 
the system. Then the Monte Carlo method permits 
the computation of the probabilistic laws of the entire 
system, however complex it may be. 

Such methods of calculations are extremely help- 
ful in planning enterprises. Instead of an expensive 
(and sometimes impossible) real experiment, we can 
carry out experiments on a computer, trying out var- 
ious versions of job organization and equipment us- 
age. 



calculating the quality 
and reliability of devices 

The Simplest Scheme for Estimating Quality 

Let us consider a device made up of a certain (pos- 
sibly large) number of elements. For example, this 
device may be a piece of electrical equipment, made 
of resistors (Rk), capacitors (C*), tubes, and the like. 
We assume that the quality of the device is deter- 
mined by the value of an output parameter U, which 
can be computed from the parameters of all the ele- 
ments: 

U = f(R lt R 2 ,...; d,C 2 , ...;...) (2.5) 

40 examples of application ofmonte carlo method 



B3P 
Z2.KJB5X2W 



Fig. 2.2. A resistor witn re- 
sistance 22 jfefi ± 5%. B3P is 
the Russian trademark. 

If, for example, U is the voltage in an operating section 
of an electric circuit, then by Ohm's law it is possible 
to write equations for the circuit and, solving them, 
to find U. 

In reality the parameters of the elements are never 
exactly equal to their indicated values. For instance, 
the resistor illustrated in Figure 2.2 can "test out" 
anywhere between 20.9 and 23.1 kiloohms (B3P is 
the Russian trademark). The question here is: what 
effect do deviations of the parameters of all these el- 
ements have on the value of U? 

We can try to compute the limits of variation of U, 
taking the "worst" values of the parameters of each 
element. However, it is not always clear which set 
of parameter values are the worst. Furthermore, if 
the total number of elements is large, the limits thus 
computed are probably highly overestimated: it is 
very unlikely that all the parameters are simultane- 
ously functioning at their worst. 

It is, therefore, more reasonable to consider the pa- 
rameters of all the elements, and the value U itself, 
to be random, and to try to estimate the mathemati- 
cal expectation MU and the variance DU. The value 
MU is the mean value of U for the aggregate of de- 
vices, and DC/ will show what deviations of U from 
MU will be encountered in practice. Recall from our 
discussion of continuous random variables in Chap- 
ter 1 that, in general, 

MU?f(MR lt MR2, ...; MCi, MC 2l ...;...) 
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The distribution of U cannot be computed analyt- 
ically if the function / is even slightly complex; how- 
ever, sometimes this can be estimated experimentally 
by examining a large lot of manufactured devices. But 
such examination is not always possible, and cer- 
tainly not in the design stage. 

Let us try to apply the Monte Carlo method to 
this problem. To do so, we need to know: (1) the 
probabilistic characteristics of all the elements, and 
(2) the function / (more exactly, we must know how 
to calculate the value of U from any specified values 
Ri, i?2, • • • ; Ci, C2, . . . ; . . .). 

The probability distributions of the parameters for 
each element can be obtained experimentally by ex- 
amining a large batch of such elements. Quite often 
these distributions are found to be normal. Therefore, 
many investigators proceed in the following way: they 
consider, for example, the resistance of a resistor (pic- 
tured in Figure 2.2) to be a normal random variable p 
with mathematical expectation a = Mp = 22 and with 
3c = 1.1. (According to the rule of three sigmas, it is 
unlikely to get a value of p deviating from Mp by 3<r 
on any one trial.) 

The scheme for the computation is simple: first, 
for each element a random value of its parameter is 
found; then the value of U is computed according 
to Equation 2.5. Having repeated this numerical ex- 
periment N times (with different random numbers), 
we obtain values U\, U 2 , ..., Un and assume that ap- 
proximately 



Mcr«lj> 



m/«- 



/ AT 

1 






For large N in the last formula the factor l/(iV - 1) 
can be replaced by l/N, and then this equation will 
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Fig. 2.3. Scheme of a device corre- 
sponding to Equation 2.6. 




Fig. 2.4. Scheme of a device corre- 
sponding to Equation 2.7. 

be a direct consequence of Equations 1.8 and 1.9. 
In statistics it has been shown that for small N it is 
better to keep the factor l/(N - 1). 

Examples of the Calculation of Reliability 

Suppose we want to estimate how long, on av- 
erage, a device will function without breakdown, as- 
suming that we know the characteristics of the failure- 
proof operation of all its components. 

If we consider the breakdown time of each compo- 
nent t(k) to be a constant, then the computation of 
the breakdown time t of the whole device presents no 
problems. For example, for the device schematically 
represented in Figure 2.3, in which the failure of one 
component implies the failure of the entire device, 



t = min(< ( i); < (2 ); t (3) ; <( 4) ) 



(2.6) 



And for the device in which one of the elements is 
duplicated with a stand-by element, (schematically 
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represented in Figure 2.4), 

t = min (i (1) ; t^y max(i( 3) ; t (4) ); f (5 )), (2.7) 

since if element 3 fails, for example, the device will 
continue to work with the remaining element 4. 

In practice the duration of failure-proof operation 
of any component of a device is a random variable. 
When we say that a light bulb is good for 1 ,000 hours, 
we only mean that 1,000 hours is the average value 
MO of the random variable 9. Everyone knows that 
one bulb may burn out sooner while another will last 
longer. 

If the distribution densities of the breakdown times 
0( fc ) for all the components of a device are known, M0 
can be computed by the Monte Carlo method, fol- 
lowing the plan of the simplest scheme for estimat- 
ing quality. Indeed, for each component we obtain 
a value t^ for the random variable O^y Then, from 
Equations 2.6 or 2.7, we are able to compute a value t 
of the random variable 9, representing the breakdown 
time of the whole device. Repeating this trial enough 
times (JV), we can obtain an approximation 

i=i 

where tj is the value off obtained in the jth numerical 
experiment. 

It should be mentioned that the problem of the 
probability distributions of breakdown times for indi- 
vidual components is not as simple as one may think: 
actual experiments are difficult to perform, since one 
must wait until enough elements have broken down. 

Further Possibilities of the Method 

The preceding examples show that the techniques 
for predicting the quality of devices being designed 
is quite simple in theory. One must know the proba- 
bilistic characteristics of all the components of a given 
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Fig. 2.5. Numbers of values that appeared 
in each interval. 

device, and be able to compute the variable of interest 
as a function of the parameters of these components. 
The randomness of the parameters is taken into ac- 
count by means of our simulation. 

From the simulation it is possible to obtain much 
more information than just the expectation and vari- 
ance of the variable that interests us. Suppose that 
we have found a large number of values Ui, U 2 , ..., U N 
of the variable U. From these values we can construct 
an approximate density of U. (In fact this is a general 
statistical problem of experimental data processing, 
though here we have carried out numerical experi- 
ments.) We are limiting ourselves here to a concrete 
example. 

Let us assume that we have, altogether, 120 val- 
ues Ui, U 2 , .., U120 of the random variable U, all con- 
tained in the range 

1.0 < Uj < 6.5 

Let us divide the interval 1.0 < x < 6.5 into 11 equal 
intervals of length Ax = 0.5 (or any number of in- 
tervals that is neither too large nor too small), and 
count how many values of Uj fall into each interval. 
The results are given in Figure 2.5. 

The frequency of "hits" in any interval is calculated 
by dividing these numbers by N = 120. In our example 
the frequencies are: 0.017; 0.000; 0.008; 0.120; 0.200; 
0.270; 0.140; 0.160; 0.060; 0.008; 0.017. 

On each of the intervals of the partition, let us 
construct a rectangle with area equal to the frequency 
of hitting that interval (Figure 2.6). In other words, 
the height of each rectangle is equal to the frequency 
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Fig. 2.6. Histogram and density. 

divided by Ax. The graph obtained is known as a 
histogram. 

The histogram serves as an approximation to the 
unknown density of the random variable U. There- 
fore, for example, the area of the histogram between 
x = 2.5 and x = 5.5 yields an approximate value of the 
probability 

P{2.5 < U < 5.5} « 0.95 

Hence, we can assume on the basis of our experi- 
ments that, with a probability of approximately 0.95, 
the value of U falls in the interval 2.5 < U < 5.5. 

In Figure 2.6 the density of a normal random vari- 
able (' with the parameters a = 3.85, a = 0.88 has been 
constructed as a comparison. If the probability of £' 
falling inside the interval 2.5 < x < 5.5 is computed 
from this density, we obtain a fairly close value, 0.91. 
Indeed, according to Chapter 1 (see Normal Random 
Variables) 

P{2.5 < C' < 5.5} = 0.5[S(t 2 ) - «(*i)] 
where t x = (2.5 - a)/a = -1.54, t 2 = (5.5 - a) /a - 1.88; 
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therefore, 

P{2.5 < C' < 5.5} = 0.5[$(1.88) + $(1.54)] = 0.91 

A Remark 

It is unfortunate that calculations of this type are 
still fairly scarce — primarily because designers are 
not aware of this method. 

Furthermore, before using the method to evaluate 
a device, one must find out the probabilistic char- 
acteristics of all the components that go into it — 
this entails a lot of work. Once these characteristics 
are known, however, one can evaluate the quality of 
any device that is made of these components. It is 
even possible to estimate variations in quality when 
certain components are replaced by others. 

One might hope that in the near future such cal- 
culations will become routine, and the probabilistic 
characteristics of different elements will be supplied 
by their producers. 

computation of neutron transmission 
through a plate 

The laws of interaction of single elementary par- 
ticles (neutrons, photons, mesons, etc.) with matter 
are known. Usually it is necessary to find out the 
macroscopic characteristics of processes in which an 
enormous number of such particles participate: den- 
sities, fluxes, and so on. This situation is quite sim- 
ilar to the one we encountered in our discussions of 
mass servicing systems and the quality and reliabil- 
ity of devices; it, too, can be handled by the Monte 
Carlo method. 

Nuclear physics is, perhaps, the field in which the 
Monte Carlo method is used most frequently. We wiH 
consider the simplest version of the problem of neu- 
tron transmission through a plate. 
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Fig. 2.7. Problem statement. 

Statement of the Problem 

Let a stream of neutrons with energy E fall on a 
homogeneous infinite plate < x < h, the angle of 
incidence being 90° (Figure 2.7). In collisions with 
the atoms of which the plate is composed, neutrons 
can be either elastically scattered or absorbed. Let 
us assume, for simplicity, that the energy of a neu- 
tron does not change in scattering, and that all di- 
rections of scattered neutrons are equally probable 
(this is sometimes the case for neutron collisions with 
heavy atoms). Figure 2.8 illustrates the possible fates 
of a neutron: (a) neutron passes through the plate, 
(b) neutron is absorbed, and (c) neutron is reflected 
by the plate. 

We are required to estimate the probability p + of 
neutron transmission through the plate, the proba- 
bility p° of a neutron being absorbed by the plate, 
and the probability p~ of neutron reflection from the 
plate. 

Interaction of neutrons with matter is character- 
ized in the case under consideration by two constants 
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Fig. 2.8. Possible fates of 
a neutron: (a) transmission, 
(b) absorption, and (c) reflec- 
tion. 

£ a and £ s , denoting the absorption cross-section and 
the scattering cross-section, respectively. The total 
cross-section in this case is the sum of these two 
cross-sections: 

The physical meaning of the cross-sections is as 
follows: In a collision of a neutron with an atom, the 
probability of absorption is £ a /E, and the probability 
of scattering is £,/£. 

The distance A between two consecutive collisions 
of a neutron is called the free path length. This is 
a random variable; it can assume any positive value 
with probability density 



p(x) = £e' 



■iS 



< x < oo 



This density of A coincides with the density (2.1) 
of the random variable r in a simple flow of events 
(refer to the discussion under The Simple Flow of Re- 
quests earlier in this chapter). Accordingly, we can 
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immediately write the expression for the mean free 
path length 

MA = 1/E 

and an equation for modeling A: 

A = -(1/E) In 7 

There remains the question of how to select the 
random direction of a scattered neutron. Since the 
situation is symmetric with respect to the x axis, the 
direction is completely defined by one angle <p be- 
tween the velocity of a scattered neutron and the 
x axis. It can be proved (see Transformations of the 
Type £ = 0(7) in Chapter 3) that the requirement of 
equal probabilities in each direction is equivalent in 
this case to the requirement that the cosine of this 
angle // = cos ip be uniformly distributed over the in- 
terval -1 < x < 1. Equation 1.24 for a = -1 and 6=1 
yields an expression for modeling random values of 

It = 2 7 - 1 

Simulation of Real Trajectories 

Let us assume that a neutron has undergone its 
kth scattering inside the plate at a point with abscissa 
x k , and afterward began to move in the direction fi k . 
We find the free path length 

A* = -(1/E) In 7 
and compute the abscissa of the next collision 

Xk + l = x k + Afc^jfc 

{Figure 2.9, in which /i k = cos <p k ). 

We check to see if the condition for passing through 
the plate has been met: 
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Fig. 2.9. The free path X k . 

If it has, the computation of the neutron's trajectory 
stops, and a 1 is added to the counter for transmit- 
ted particles. Otherwise, we test the condition for 
reflection: 

Xk+i < 

If this condition is met, the computation of the neu- 
tron's trajectory stops, and a 1 is added to the counter 
for reflected particles. If this condition also fails, that 
is, if < x k +i < h, it means that the neutron has 
undergone its (k + l)th collision inside the plate, and 
we have to determine the fate of the neutron in this 
collision. 

In accordance with the method discussed in Chap- 
ter 1 for modeling a discrete random variable, we se- 
lect the next j and test the condition for absorption: 

7 < S a /S 

If this last inequality holds, the trajectory is termi- 
nated and a 1 is added to the counter for absorbed 
particles. If this last inequality does not hold, we as- 
sume that the neutron has been scattered at a point 
with the abscissa x k+1 . Then we find a new direction 
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Fig. 2.10. Flowchart for simu- 
lating one trajectory. 

of movement 

Hk+i = 27-1 

and repeat the cycle once more (but, of course, with 
new values of 7). 

All the 7 are written without subscripts, since each 
value of 7 is used only once. To compute one lap of 
the trajectory, three values of 7 are needed. 

The initial values for every trajectory are 

x = 0, fi = 1 

After N trajectories have been sampled, we find that 
N + neutrons have gone through the plate, N~ have 
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been reflected from it, and N° have been absorbed. It 
is obvious that the desired probabilities are approxi- 
mately equal to the ratios 

, N+ _ N~ , N° 

p T ss , p « , and p « — 

In Figure 2.10 a flowchart of the program for the 
computation of one trajectory is shown. The sub- 
script k is the collision number along the trajectory. 

This computation procedure, although natural, is 
far from being perfect. In particular, it is difficult 
to estimate the probability p + when it is very small. 
However, this is precisely the case we encounter in 
calculating protection against radiation. 

Fortunately, there exist more sophisticated ver- 
sions of the Monte Carlo method that make these 
computations possible. We shall now briefly discuss 
one of the simplest methods of calculation with the 
help of so-called "weights". 

Computation Scheme Using Weights 
that Replace Absorption 

Let us reconsider the same problem of neutron 
transmission. Let us assume that a "package", con- 
sisting of a large number w of identical neutrons, 
is moving along a single trajectory. For a collision 
at the point with abscissa x\ , the average number of 
neutrons in the package that would be absorbed is 
^o(S a /S), and the average number of neutrons un- 
dergoing scattering would be ti>o (£*/£)• Let us add 
the value w (Y, a /Y,) to the counter of absorbed parti- 
cles, and follow the motion of the scattered package, 
assuming that all the remaining w\ = u; (S s /£) neu- 
trons are scattered in a single direction. 

All the equations given for simulation of real trajec- 
tories remain the same. But the number of neutrons 
in the package after each collision is reduced 

■wk+i = iu;b(£j/£) 
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since a part of the package, containing w t (E a /E) neu- 
trons, will be captured. Now the trajectory cannot be 
terminated by absorption. 

The value Wk is usually called the weight of the 
neutron and, instead of talking about a package con- 
sisting of Wk neutrons, we speak of a single neutron 
with weight Wk- The initial weight w is usually set 
equal to 1. (This is not contrary to what was said 
about a "large package", since all the w k obtained 
while computing a single trajectory contain w as a 
common factor.) 

A flowchart of the program that accomplishes this 
computation is given in Figure 2.11. Clearly, it is 
no more complex than the flowchart in Figure 2.10. 
However, one can prove that the computation of p + 
by the last method is always more precise than the 
computation by simulation. 

Indeed, let us introduce random variables rj and 
rf , equal to the number (weight) of neutrons passing 
through the plate, r\ and rj are obtained by modeling 
one trajectory by the method for simulating real tra- 
jectories, and by the method for computation using 
weights that replace absorption, respectively. 

If wo = 1, it is obvious that 

Mrf = M.T) = p + 

(A rigorous proof of this statement can be found in 
Sobol'. 3 ) 

Since 77 can assume only two values, and 1 , the 
distribution of rj is 

1 

p + 1 — p + 

Taking into account that rf = rj, we easily find that 
D7/ = p+-(p+) 2 . 

It is easy to see that the second variable, rj', can 
assume an infinite sequence of values: w = I, wi, w 2 , 
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Fig. 2.11. Flowchart for comput- 
ing one trajectory using weights. 

. . . , to*, . . . , including the value 0. Therefore its dis- 
tribution is given by the table 



w 


IDl 


w 2 


. . w k . 


.. 


10 


3i 


«2 ■ 


■ ■ ik ■ 


... q 



The probabilities q k need not interest us, since in any 
case the variance of rf can be written as 

oo 

*=i 
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Finally, noticing that w k < 1 and that 

oo 
^2w k q k = Mr]' =p + 



k=i 



we obtain the inequality Dtj' < p + - (p + ) 2 . Hence, 
Dtj' < Djj. 

A Remark 

There are many ways to perform calculations us- 
ing weights; we only emphasize that the Monte Carlo 
method enables us to solve complex problems involv- 
ing elementary particles: the medium may consist 
of different substances and can have any geometri- 
cal shape; the energy of the particles may change in 
each collision. Many other nuclear processes can be 
taken into consideration (for example, the possibility 
of an atom fission by collision of a neutron, and the 
subsequent release of new neutrons). Conditions for 
the initiation and maintenance of a chain reaction 
can be modeled, and so forth (see An Astrophysical 
Problem, which follows next in this chapter). 



an astrophysical problem 

In the next problem the Monte Carlo method is 
again applied for estimating the transport of particles 
through matter. Here, the particles are photons; the 
matter in this case is a spherical plasma cloud. 

Comptonization 

Since the discovery of X-ray and 7-ray bursts of 
cosmic origin, astrophysicists have tried to develop 
models that would explain the observed radiation 
spectra: the power-law energy distribution of such 
spectra has been puzzling, and sometimes called 
"nonthermal". 

The simplest model of a compact X-ray source is 
a cloud of hot plasma with a low-frequency v photon 
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source at the center. The photon's energy hv is in- 
creased due to multiple scattering by hot electrons, 
and it emerges from the cloud as hard X-ray or even 
7-ray radiation. 

In physics, the change of frequency of a photon 
that is scattered by an electron is called the Comp- 
ton effect; the change in the photon spectrum due to 
multiple scattering of photons by thermal electrons is 
called the Comptonization of radiation. This process 
is one of the chief mechanisms for producing hard 
radiation spectra in high-energy astrophysics. Ra- 
diation spectra of various compact sources (neutron 
stars, accretion discs around black holes, quasars, 
galactic nuclei, etc.) may be considered to be pro- 
duced by Comptonization. And the most efficient 
method for modeling such spectra is the Monte Carlo 
method (see Pozdnyakov et al). 4 

(In reality, the low-energy photons are not neces- 
sarily born at a single point: for example, there may 
be a black hole at the center of the cloud. However, it 
was proved that the hard-radiation spectrum result- 
ing from Comptonization is insensitive to the spatial 
distribution of the low-frequency photon source. As 
physicists say, after several scatterings the photon 
forgets its birthplace!) 

Example: A Powerful Source of X- and 7-Rays 

A powerful flux of hard 7-rays coming from the nu- 
cleus of the Seyfert galaxy NGC 4151 was detected in 
1979. Figure 2.12, from the cited paper (see Pozd- 
nyakov et al), 4 is an attempt to model the observed 
spectrum of this flux. The flux F v is proportional 
to the probability density of photons escaping from 
the cloud with energy hv. The dots are experimental 
data; error bars (in some measurements rather large) 
are omitted. 

The histogram in Figure 2.12 is computed by the 
Monte Carlo method. 
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Fig. 2.12. Modeling of the 
X- and 7-ray spectrum of the 
nucleus of the Seyfert galaxy 
NGC 4151. Dots indicate ex- 
perimental data. 

The low-frequency photon source is assumed to 
have a blackbody (Planck) spectrum with radiation 
temperature T r ; the hot electrons in the cloud are 
Maxwellian with temperature T e , and the cloud is a 
sphere of radius r. A satisfactory "fit" is obtained with 
kT r = 0.5 eV, kT e = 2.0 MeV, and r = 0.4 in units of 
photon mean free path with respect to cold electrons. 

From Figure 2.12 one can see that over a wide 
energy range (approximately from hv = 0.001 MeV up 
to hv = 6 MeV), the values of lg F v are proportional to 
lg hv. From the slope of the histogram one can easily 
estimate the so-called spectral index of the power-law 
distribution 

Fy ~ (hu)- 12 

At higher energies (hu > MT e ) there is an exponential 
cutoff of the spectrum. 
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On the Method of Computation 

As we have already mentioned, the methods for 
computing neutron transmission through a plate can 
again be applied in this case. In doing so, the en- 
ergy of each newly born photon must be modeled ac- 
cording to the Planck distribution; the energy of the 
scattering electron must be modeled as well; a mean 
free path for the photon that depends on hv must be 
computed after each scattering; and so forth. We will 
not tire the reader by reproducing here all the equa- 
tions involved in these calculations (see Pozdnyakov 
et al). 4 

It is more interesting that the probabilities on the 
right end of the spectrum in Figure 2.12 are 10 6 times 
smaller than the probabilities on the left end. Clearly, 
these probabilities are not computed by straightfor- 
ward simulation. The weights used are more sophisti- 
cated than those described earlier in the Computation 
Scheme Using Weights that Replace Absorption. For 
a photon having weight Wk, after the jfcth scattering, 
the probability of direct escape L k is first computed. 
Next, a "portion" of the photon with weight wkL k is 
added to the counter of escaping photons, while the 
"remainder'' of the photon with weight Wk+i — Wk(l — 
L k ) experiences a (k + l)th collision inside the cloud. 



evaluating a definite integral 

The problems considered in the preceding sections 
are probabilistic by nature, and to use the Monte 
Carlo method to solve them seems natural. Now we 
will consider a purely mathematical problem: the ap- 
proximate evaluation of a definite integral. 

Since evaluating a definite integral is equivalent 
to calculating an area, the hit-or-miss method dis- 
cussed in the Introduction could be used. However, 
we will present here a more efficient method — one 
that allows us to construct various probabilistic mod- 
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els for solving the problem by the Monte Carlo method. 
In addition, we will indicate how to select better mod- 
els from among all alternatives. 

The Method of Computation 

Let us consider a function g(x) defined over the 
interval a < x < b. Our assignment is to compute 
approximately the integral 





= /*(*) 



dx (2.8) 



(If the integral is improper, we assume that its con- 
vergence is absolute.) 

We begin by selecting an arbitrary distribution 
density p{x) defined over the interval (a, b) — in other 
words, an arbitrary function p(x) satisfying (1.15) and 
(1.16). In addition to the random variable £ (defined 
over the interval (a, b) with density p(x)), we need a 
random variable 

" = g(0/p(t) 
By (1.18), 

b 

Mtj= I 9 -Qp(x)dx = I 
J P[x) 

a 

Now let us consider N independent, identically 
distributed random variables ■q 1 , t) 2 , . ■ ■ , m and apply 
the central limit theorem to their sum. In this case 
Equation 1.21 is written 

N 

I I In I 

0.997 







This last relation means that if we choose N values 

fii £2, • • • 1 £/v. then for sufficiently large iV 
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This form of Equation 1.21 also shows that in all 
likelihood the error of approximation in (2.9) will not 
exceed SCDrj/N) 1 / 2 . 

Importance Sampling 

Again, to compute the integral (2.8), we could use 
any random variable f , denned over the interval (a, b) 
with density p(x) > 0. (In general, the density p(x) may 
vanish inside (a, b) but only in those points where 
g(x) = 0). In any case 

Mt? = M( ff (0/p(0) = ! 

However, the variance D77, and, hence, the estimate of 
this error of approximation depend on what variable 
£ we use, since 

b 

B V = M( V 2 ) -I 2 = f £Q dx - I 2 (2.10) 

J P{x) 

a 

Let us prove that D77 is minimized when the density 
p(x) is proportional to |flr(a;)|. 

We will use an inequality well known in analysis: 

b 2 b b 

( / \u(x)v(x)\ dx ) < / u 2 {x)dx I v 2 (x)dx 



If we set u = g(x)/yjp{x) and v = ^p(x), then we obtain 

b 2 b ■ b 

(J\g(x)\dx^ < j 9 -^dxjp{x)dx 

a a a 

-I'M" 

a 

It follows from (2.10) and (2.11) that 

b 2 

D»7> (I \g(x)\dx) -I 2 (2.12) 

a 
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Now it remains to prove that the lower bound (2.12) 
of the variance D77 is attained when the density is 
proportional to \g(x)\, that is, when p = c\g(x)\. Since 
the density must satisfy (1.16), it follows that 



1 
\g(x)\dx 



Hence, 

b 



J 9 -M d x=- c j\g(x)\dx=(^J\g(x)\dx 



and the right-hand side of (2.10) is indeed equal to 
the right-hand side of (2.12). 

In practice, it is impossible to use the "best" den- 
sity. To obtain the "best density", we must ascertain 

b 

the value of the integral f\g(x)\dx; however, evalua- 

a 

tion of the last integral is just as difficult as evalua- 
tion of the integral (2.8). Therefore, we restrict our- 
selves to the following suggestion: It is desirable that 
p(x) be proportional to \g(x)\. 

Of course, very complex p(x) should not be se- 
lected, since the modeling of £ would become exces- 
sively laborious. But it is possible to use the sugges- 
tion given above as a guide in choosing p(x) (compare 
this with the following numerical example). 

Estimate (2.9), with a density similar to \g(x)\, is 
called importance sampling. 

We recall that in practice, one-dimensional inte- 
grals of the form (2.8) are not computed by the Monte 
Carlo method — quadrature formulas provide greater 
accuracy for calculating such integrals. But the sit- 
uation changes when we turn to multidimensional 
integrals: quadrature formulas become very complex 
or inefficient, while the Monte Carlo method remains 
almost unchanged. 
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Fig. 2.13. The integrand y = 
sin x and two densities. 

A Numerical Example 

Let us approximately compute the integral 



tt/2 

I=jsinxdx 





The exact value of this integral is known: 



tt/2 

/ sin x dx = [ — cos x] 



Tt/2 



We shall use two different random variables £ for 
the calculation: one with constant density p(x) = 2/ it 
(that is, a uniform distribution in the interval < x < 
7r/2), and the other with linear density p(x) — 8x/w 2 . 
Both of these densities, together with the integrand 
sin x, are shown in Figure 2.13. It is evident that the 
linear density agrees better with the recommendation 
of importance sampling, that it is desirable for p(x) to 
be proportional to sin x. Thus one may expect that 
the second approach will yield the better result. 



chapter 2 63 



first approach 

Let p(x) = 2/7r for all a; in (0, w/2). The formula for 
modeling £ can be obtained from (1.24) for a = and 

b = tt/2: 

e w 
t=2 7 

Now Equation 2.9 takes the form 

N 

I « — - V^ sin £,■ 

Let AT = 10. For values of 7 let us use groups of 
three digits from Table 1.1, multiplied by 0.001. The 
intermediate results are listed in Table 2.1. The final 
result of the computation is 

I » 0.952 



second approach 

Now let p(x) = 8x/ir 2 . For the modeling of £ we use 
Equation 1.23: 

" 8x , 

ax = 7 



/ 




whence, after some simple calculations, we obtain 

Equation 2.9 takes on the form 



2 AT . ^ 



Let N = 10. We use the same random numbers 7 
as in the first approach. The intermediate results are 
listed in Table 2.1. The final result of the computation 
is 

las 1.016 
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Table 2.1 Intermediate Results 





lj 


First approach 


Second approach 


3 


ti 


sin £j- 


ti 


sin £j 



1 


0.865 


1.359 


0.978 


1.461 


0.680 


2 


0.159 


0.250 


0.247 


0.626 


0.936 


3 


0.079 


0.124 


0.124 


0.442 


0.968 


4 


0.566 


0.889 


0.776 


1.182 


0.783 


5 


0.155 


0.243 


0.241 


0.618 


0.937 


6 


0.664 


1.043 


0.864 


1.280 


0.748 


7 


0.345 


0.542 


0.516 


0.923 


0.863 


8 


0.655 


1.029 


0.857 


1.271 


0.751 


9 


0.812 


1.275 


0.957 


1.415 


0.698 


10 


0.332 


0.521 


0.498 


0.905 


0.868 



As we anticipated, the second approach gives the 
more accurate result. 

error estimates 

From the values given in Table 2.1, one can ap- 
proximate the variances D77 for both cases. (The com- 
putation equation can be found under The Simplest 
Scheme for Estimating Quality earlier in this chap- 
ter.) For the first approach 

2/ 10 10 v 

D ^f4(E( si ^i) 2 -i(E si ^i) 2 
\/=i j=i j 



= —(4.604 -3.670) = 0.256 
36 



For the second approach 



D77 : 




-ME 



sin £j 
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7T 4 



= i=g(6-875 - 6.777) = 0.016 

Though the sample size N = 10 is small and the 
applicability of the central limit theorem cannot be 
guaranteed, let us estimate the probable errors r^ = 
0.6745(D»?/./\0 1 / 2 for both computations. For the first 
approach, r N = 0.103, while for the second, r N = 0.027. 
Clearly, the actual absolute errors — 0.048 and 0.016, 
respectively — are of the same order of magnitude as 
these probable errors. 

The exact values of the variances D?/ are 0.233 and 
0.0166. One can see that the second approach, which 
includes importance sampling, is also more accurate 
for estimating the variance. 
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information 



on pseudorandom numbers 

Most of the algorithms for generating pseudoran- 
dom numbers have the form 



Ik =$(t*-i) for * = 1, 2, 



(3.1) 



If the initial number 70 is fixed, all the successive 
numbers 71 , 72 , ... are calculated by the same Equa- 
tion 3.1. The mid-square method (see Pseudorandom 
Numbers in Chapter 1) also has the form of (3.1): a 
set of operations is given that is to be applied to the 
argument x to obtain the value y (rather than the 
analytical expression of the function y = $(x)). 

The Basic Difficulty of Selecting $(x) 

Let us prove that the function $(x) shown in Fig- 
ure 3. 1 cannot be used for generating pseudorandom 
numbers by means of Equation 3.1. 

Indeed, we may consider a sequence of points with 
Cartesian coordinates 



(71. 72), (73, 74), (75, 7e), 




1 X 

Fig. 3.1. A function that 
should not be used for 
generating pseudorandom 
numbers. 

inside the unit square 0<x<1,0<j/<1. Since here 
we have 72 = $(71), 74 = $(73). 7e = $(75). ■ . . . all these 
points are located on the curve y = $(z). This, of 
course, is unacceptable, because true random points 
must uniformly fill the whole square. 

Thus, we conclude that a function y = $(x) can 
be successfully used in Equation 3. 1 only if its graph 
provides a sufficiently dense filling of the square! 

An example of a function possessing this property 
is 

y={gx} (3.2) 

where g is a very large integer, and {z} denotes the 
fractional part of z. Function (3.2) is plotted in Fig- 
ure 3.2 for g = 20. (The reader can imagine what this 
graph looks like at g = 5 17 .) 

The Congruential Method 

The most popular algorithm for generating pseu- 
dorandom numbers was suggested by D. H. Lehmer 
in 1949. This algorithm is based on Equation 3.2, but 
to avoid round-off errors, the calculation formulas are 
written in a different form. 
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y={20x} 



o i x 

Fig. 3.2. A function whose 
graph fills the square. 

A sequence of integers m k is denned in which the 
initial number m is fixed, and all subsequent num- 
bers mi, m.2, ... are computed by one formula: 



m k = gm k _i(mod M) 



(3.3) 



for k - 1, 2, . . .; from the numbers m k we calculate 
the pseudorandom numbers 



7* = m k /M 



(3.4) 



The integer g is called multiplier, and the integer M 
is referred to as the modulus. 

Relation (3.3) means that the number m k is equal 
to the remainder left when gmk-x is divided by M. In 
the theory of congruence (see any textbook on num- 
ber theory) this remainder is known as the least pos- 
itive residue modulo M. Triplets (g, M, m ) can be 
found that produce relatively satisfactory sequences 
of pseudorandom numbers. 

Example 1 

Let g = 5 17 , M = 2 40 , m = 1. This generator has 
been successfully used in Russian computers oper- 
ating with 40-bit numbers. Here both formulas (3.3) 
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and (3.4) are easily realized: carry out a double pre- 
cision multiplication g ■ m k -i, and use the low-order 
digits of the product. The period of the sequence m k 

is P = 2 38 «2.7-10 u . 

Example 2 

Let g = 65539 = 2 16 + 3,M = 2 29 , m = 1. This is 
the infamous RANDU generator, supplied by IBM in 
their 360 series and included in the Russian ES com- 
puters' software. Many users have obtained unsatis- 
factory results in computations using pseudorandom 
numbers produced by the RANDU generator. A clear 
proof that this multiplier is inadequate can be found 
in Forsythe et al. 5 (Nevertheless, there are still at- 
tempts to carry out computations using RANDU.) 

Pseudorandom Numbers 
for Personal Computers 

Ordinary congruential generators are not suitable 
for computers with short words: if the modulus is 
small, the period is small; if a long word generator is 
implemented "by parts", generation is slow. 

It was proposed by Wichman and Hill to use in 
parallel three very short word generators, 6 

m k = 171mjfe_i(mod 30269) 

m' k = 172m' i ._ 1 (mod 30307) 

m'l = 170mi'_i(mod 30323) 

and to consider as pseudorandom numbers the frac- 
tional parts 

' \ 30269 30307 30323 J 

This algorithm can be used in any computer with a 
word length of 16 (or more) bits. The period of the 
sequence j k is P ss 6.9 ■ 10 12 . 

Numerical experiments by Levitan and Sobol' 7 
show that, contrary to the suggestion of Wichman 
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and Hill, 6 the initial values m , m' , m' ' should not be 
selected at random. A good sequence can be obtained 
with m = 5, m' = 11, m'o = 17 (almost ten million j k 
were tested). 

Program 

FUNCTION RANDOMO 

C RETURNS A PSEUDO RANDOM NUMBER UNIFORMLY 

C DISTRIBUTED BETWEEN AND 1 

C SET IX=5, IY=11, IZ=17 BEFORE FIRST ENTRY 

C INTEGER ARITHMETIC TO 30323 REQUIRED 

COMMON/RAND/IX, IY,IZ 

IX=171*M0D(IX, 177)-2*(IX/177) 

IY=172*M0D(IY, 176)-35*(IY/176) 

IZ=170*M0D(IZ, 178)-63*(IZ/178) 
C 

IF(IX.LT. 0)IX=IX+30269 

IF(IY.LT.0)IY=IY+30307 

IF(IZ.LT. 0)IZ=IZ+30323 
C 

RAND0M=M0D CIX/30269 . +IY/30307 . +IZ/30323 . , 1 . ) 

RETURN 

END 



on methods for generating 
random variables 

This section contains a brief description of the 
most important transformations used for modeling 
random variables. These transformations are clas- 
sified according to the quantities of random numbers 
that are used for obtaining a single random value. 
This classification was introduced by Sobol', 3 and is 
of crucial importance for quasi-Monte Carlo methods 
(see Constructive Dimension of a Monte Carlo Algo- 
rithm below). 
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Transformations of the Type £ =g(j) 

The most important transformation of this type is 
the method of inverse Junctions (sometimes called the 
direct method). We will now show that methods of 
modeling discrete and continuous random variables 
described in Chapter 1 (see Transformations of Ran- 
dom Variables) are special cases of the general inverse 
functions method. 

First, let us recall that the probability distribution 
Junction F(x) of an arbitrary random variable £ (or 
the cumulative distribution Junction), is defined for all 
-oo < x < oo by the relation 

F{x) = P{£ < x} 

(refer to any textbook on probability theory). It is a 
non-decreasing function, and both limits exist: 

lim F(x) = 

r— > — oo 

and 

lim F(x) = 1 

x— *oo 

However, F(x) is not necessarily strictly monotonic: 
there may be intervals of constancy. Also, it is not 
necessarily continuous — there may be jumps. 

Second, assume that the function y = F(x) is con- 
tinuous and strictly monotonic (Figure 3.3). Then a 
continuous inverse function x = G(y) exists, and for 
all -oo < x < oo and < y < 1 

G{F(x)) = *, F(G(y)) = y (3.5) 

It is easy to prove that the distribution function of 
G(j) is equal to F(x). Clearly, 

P{G( 7 )<*}=P{i ! '(G(7))<F(x)} 

= P{y<F(x)} 

Since 7 is uniformly distributed over the unit interval, 
the probability 

P{7 < F(x)} = P{0 < 7 < F(x)} 
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y=F(*) 





Fig. 3.3. Continuous and 
strictly monotonic distribu- 
tion function y = F(x) and its 
inverse. 

is equal to the length of the interval (0, F(x))\ so, 

P{G(f)<x} =F(x) 

Hence, a random variable £ having such a prob- 
ability distribution function F(x) can be modeled by 
the equation 

i = G(j) (3.6) 

However, the variable 1 — 7 is uniformly distributed 
over the same interval as 7. Therefore the formula 



£ = G(l-7) 



(3.7) 



can be used instead of (3.6). 

Both transformations (3.6) and (3.7) represent the 
method of inverse functions. 

The General Method 

Consider now an arbitrary random variable £ with 
an arbitrary distribution function F(x), so that a clas- 
sical inverse function G(y) may be multivalued or not 
denned for all < y < 1. In this case a generalized 
inverse function can be introduced that does not sat- 
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y 

1 1- 



x =G(y) 



Fig. 3.4. Distribution 
function y = F(x) of 
a continuous random 
variable and its inverse. 

isfy (3.5), but that does satisfy a weaker requirement: 

that inequalities G(y) > x and y > F(x) be equivalent. 

Then the previous proof can be slightly modified: 

P{G(j)<x} = l-P{G(y)>x} 

= 1 - P{T > F(x)} = P{ T < F(x)} = F(x) 

Example 1 

Consider the continuous random variable £ from 
Modeling of a Continuous Random Variable in Chap- 
ter 1. The curve shown in Figures 1.9 and 1.10 is 
the nontrivial part of the distribution function 



F(x) = P{£ < x) = P{a < £ < x} 

X 

— J p(x) dx for a < x < b 



(3.8) 
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Fig. 3.5. Distribution function y = 
F(x) of a discrete random variable 
and its inverse. 

The full distribution function y = F(x) is shown in 
Figure 3.4, together with the inverse function x = 
G(y). Here the "generalization" is very simple: we 
exclude the multivalued tails at x < a and x > b. 
Clearly, Equation 1.23 is identical to Equation 3.6. 

Example 2 

Consider the discrete random variable £ from Mod- 
eling of a Discrete Random Variable, in Chapter 1. Its 
distribution function y = F(x), together with the gen- 
eralized inverse function x = G(y), can be seen in 
Figure 3.5 (for the special case n = 4). If we select a 
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12 x 

Fig. 3.6. Distribution function 
y = F(k) of a mixed random 
variable and its inverse. 

value y = j on the y axis, the corresponding G(j) is 
equal to one of the X{, and 

P{G( 7 ) = x i }= Pi 

The "generalization" here is somewhat different: we 
add vertical bars (so that the graph becomes contin- 
uous) and then exclude the horizontal bars (so that 
the function G(y) becomes univalent). 

Example 3 

Now consider a random variable £ of a mixed type: 
£ = 1, with probability 0.6, and it is uniformly dis- 
tributed in the interval < x < 2 with probability 0.4. 
In Figure 3.6 the discontinuous distribution function 
y — F(x) and the generalized inverse function x — G{y) 
are shown. 



78 additional information 



x 

/ s 

I ° 




~7~ *-^ \ 






'*A V 


z A\v^ 







Fig. 3.7. A random point on 
a sphere and its spherical co- 
ordinates. 



Since 



0.2x 

0.6 + 0.2x 



for < x < 1 
for 1 < x < 2 



F(z) = { 
the formula for modeling £ according to (3.6) is 




if 0<7<0.2 
if 0.2< T <0.8 
if 0.8 < 7 < 1 



a random direction in space 

The direction may be specified by a unit vector 
from the origin whose endpoint lies on the surface of 
the unit sphere. The words "all directions are equally 
probable" mean that the endpoint Q is a random point 
uniformly distributed on the surface of the sphere. 
The probability that £1 belongs to an element of the 
surface dS is equal to dS/(4ir). 

Consider spherical coordinates (<p, xj)) on the 
sphere with the polar axis Ox (Figure 3.7). Then 

dS = sin ip dip dip 
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where < <p < n, <ip <2ir. Let us denote by p(p, ip) 
the probability density of the point Q = (<p, ip). The 
requirement 

p(<p, ip) dip dip = dS/(4ir) 

together with the last expression for dS, yield the re- 
lation 

p(ip, ip) = (47r) _1 sin <p 

Marginal densities of <p and ip can be found easily 
from their joint density: 

Pi(f) - I Piv, ^)dip~\ sin ip 
o 

P2(il>) =fp(<p, i>)d<p= i- 



From the identity p(ip, ip) = pi(ip)p2(ip) we conclude 
that ip and ip are independent and may be modeled 
independently using two random numbers. 

First, we apply the method of inverse functions for 
modeling tp. Since for < <p < ir 

<p 
F{ip) = / pi((p) dip = -(1 - cos <p) 



we may use (3.7): 

- (1 - cos ip) = 1 - 7 

from which it follows that 

cos ip = 27 - 1 (3.9) 

Then we apply the same method (in fact, Equa- 
tion 3.6) for modeling ip that is uniformly distributed 
between and 2n: 

ip = 2tt7 (3.10) 

Equations 3.9 and 3.10 make possible the choice 
of a random (isotropic) direction in space. Of course, 
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the values of 7 in (3.9) and (3.10) are two different 
random numbers. 

Transformations of the Type £ =g(ji, 72) 

The most important transformation of this type is 
the superposition (or composition) method. 

Assume that the probability distribution function 
F(x) can be regarded as a superposition of several 
distribution functions Fi(x), . . ., F m (x): 

m 

F{x) = Y J CkF h {x) (3.11) 

with all c A > and ci + . . . + c m = 1. Let us assume 
further that random variables with distribution func- 
tions Fk{x) can be modeled, for example, using in- 
verse functions Gk(y)- 

Then we may introduce a random integer x with 
distribution 

1 2 - m ) (3.12) 

Cl c 2 ... c m J 

so that P{x=fc} = cfc, now we can define a two-stage 
modeling procedure: selecting two random numbers 
71 and 72, 1) use 71 to define a random value x = k 
and 2) use j 2 to define £ = Gjt(7 2 )- The distribution 
function of £ is F(x). 

proof 

We may use a well-known expression for the ab- 
solute probability: 



'{g*M 



< x 



= E P { G - M < *l*= *}p{* = *} 

The conditional probabilities here are evident: 

p[g x ( T2 ) <x\x=k\=P{G k (j2) <A=F k (x) 
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Fig. 3.8. Scattering angle 9. 

Hence, 

p{g x (t 2 ) <x\ = f;n(^ = F(x) 
"> ' k=i 

Clearly, if all corresponding densities exist, we may 
consider the superposition of densities rather than 
(3.11): 

m 
P( x ) = ^2 c kPk{x) 

fc = l 

Example 1 

When photons are scattered by cool electrons, the 

scattering angle (Figure 3.8) is a random variable, 

and its cosine /i = cos 9 obeys Rayleigh's law: 

3 
p(x) = -(1 + x 2 ) for - 1 < x < 1 

8 

An attempt to apply the method of inverse functions 
for modeling \i yields a cubic equation 

i(/z 3 + 3// + 4) = 7 

As an alternative, let us consider a representation 
of the density 

p(x) = 0.75 pi (x) + 0.25 v-i (x) 

with p ± (x) = 0.5, which is a constant density, and 
P2(x) = 1.5a; 2 . The corresponding distribution func- 
tions are quite simple: 

Fi(x) = \{x + 1) 
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and 






W = \{x 3 + 1) 




In both 
talning 


cases 
a final 


we can use inverse 
explicit formula: 


functions for ob 






H 


r 272 - 1 if 71 

, (2 T2 - I) 1 / 3 if 71 


<0.75 
>0.75 



Example 2 

Consider a discrete random variable 



X\ ... x n 
Pi ■■■ Pn 



(3.13) 



where all the probabilities p, are of the form p t = m t 2 -5 
with integers m,-, 1 < m,- < 2* - 1. Assume that s is 
much smaller than n. Then it may be expedient to re- 
gard £ as a superposition of several (not more than s) 
random variables with equiprobable values, since we 
know that such variables can be modeled easily. We 
will illustrate this approach with a numerical exam- 
ple. 

Let the number of possible values in (3.13) be 
n — 19, and all the probabilities p t = mi/64; the nu- 
merators m, are given in Table 3. 1 . On the right-hand 
side of Table 3. 1 the same m, are written in the binary 
system; v k is the number of ones in the kth column. 

Hence it follows that £ can be represented in the 
form of a superposition of three random variables f (*), 
with k = 4, 5, 6. The variable £( 4 ) assumes values 
xi - x 8 with probability 1/8; the variable £( 5 ) assumes 
values xi, x 2 , x 9 - z 16 with probability 1/10; the vari- 
able £( 6 ) assumes values x 3 - xe. x 9 - i 13 , x i7 - £19, 
with probability 1/12. The corresponding coefficients 
Cfc can be computed from the relation c k = v k 2~ k , and 
are equal to c 4 = 1/2, c 5 = 5/16, c 6 = 3/16. 

Denote 

(2/1, 2/2, •■•, yio) = (xi, X 2> Xg, X10, ••-, Xi 6 ) 
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Table 3.1. Numerators m { 
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Jb 
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1 2 


3 4 
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3 
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14 
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15 
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16 
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17 
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18 
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19 
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10 


12 
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and 

(zi, z 2 , ..., z i2 ) = (x 3 , x 4 , x 5 , x 6 , x 9 , x 10 , 

. .., X13, x 17 , x ls , x 19 ) 

Then the final formula for modeling £ can be written 
in the following form: 

( Xi i=l + [872] if Ti < 1/2 

£ = i jfc * = 1 + [IO72] if 1/2 < 71 < 13/16 
[ Zi i = l + [1272] if 13/16 < 71 

In this example the general method for modeling 
a discrete random variable would require many com- 
parisons of 7 with pi, Pi + P2, Pl + P2 + P3, 

modeling of a normal random variable 

Consider a normal random variable £, with 
a = and a = 1. Let 77 be a random variable with 
the same distribution, but independent of £. Then 
the probability density of a random point with Carte- 
sian coordinates (£, 77) in the (x, y) plane is equal to 
the product of one-dimensional densities: 

Let us introduce polar coordinates 
x — r cos <p and y = r sin <p 
and denote by p, the polar coordinates of the point 

£ = p cos 6, T) = p sin 
The joint density of p and can be computed eas- 
ily: 

d(x, y) 



p(r, <p) = p(x, y) 



— e~ r I 2 
2tt 



d{r, <p) 

Then the individual densities of p and 6 can be found 
by integration: 

2?r 



-r 2 /2 

re ' 



pi(r) = / p(r, y?)dv5: 
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and 

oo 

P2(<P) - J P(r, f) dr -^ 
o 
Since pi(r)p 2 (<p) = p(r,<p), we conclude that p and 6 
are independent and can be modeled independently 
using the corresponding distribution functions 

F 1 (r) = l-e- ra / 2 

and 

F 2 (<p) = <pf{2v) 

here < r < oo, < ip < 2tt. 
From the equations 

F 1 (p) = l- 71 

and 

F 2 {6) = l2 

we obtain explicit expressions 

p={-2\n ll ) 1 ' 2 , ip = 2irj 2 
The final formulas 

f = (-2 In 7i fl 2 cos 27T72 

and 

77 = (-2 In 71 )*/ 2 sin 2^72 

allow the computation of two independent normal 
variates (with a = and <r = 1) from two random num- 
bers 7! and 72. However, each of these formulas is of 
the type £ = g( 7l , l2 ). 

Transformations of the Type £ =0(71, . . . , 7n ) 

Example 1 

The random variable £ with distribution function 
F(x) = x n for < x < 1 can be modeled by the equation 

f = max(7i; ...; j n ) (3.14) 
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Indeed, the probability 



P{£ < a;} = P< max y, < x 

{ 1<8<" 
= P{7i < X, ..., J n < x} 

and, since all the random numbers are independent, 

P{7i <x, ..., y n < x} 

= P{ti < x} ... P{ 7 „ <x} = x n 

The same random variable £ can be modeled by the 
method of inverse functions (3.6). Then the equation 
£" = 7 yields the expression 

£ = Vr (3.15) 

Comparing (3.14) with (3.15) we conclude that, 
rather than extract the nth root of a random num- 
ber, one may select the greatest among n random 
numbers. 

Example 2 

In Chapter 2 the simple flow of events was consid- 
ered. It was characterized by independent exponen- 
tially distributed time intervals r between two suc- 
cessive events. In more general flows, called Erlang 
flows, the time intervals r are independent random 
variables with density 

p( x ) = "" . a:"" 1 e~ ax for < x < oo (3.16) 

(n — lj! 

Since the integral 

oo 

r(n)= I x n ~ l e~ x dx = {n - 1)! 

o 

is called the gamma-junction, the density (3.16) is 
called the gamma-distribution. 
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It can be proved that a random variable £, with 
density (3.16), may be modeled by the equation 

£ = In (71 ... 7„) 

We omit the proof (by mathematical induction), and 
mention only that Equation 2.2 is a special case of 
the last equation, namely, when n — 1 . 

the use of order statistics 

Assume that n random numbers ji, . . ., j n are re- 
ordered 

7(1) < 7(2) < • • • < 7(n) 

The value 7^) is called the sth order statistic of the 
uniform distribution. Clearly 7^) depends on all 

71, . . . , 7„. The density of 7 (s) is 

< x < 1 (3.17) 

sketch of a proof 

We fix an arbitrary interval (x, x + Ax) inside the 
unit interval. When a value of 7 is selected, one of 
three events may occur: 

M = {7 < x}, A 2 = {x < 7 < x + Ax}, 

A 3 = {x + Ax < 7} 
The probabilities of these events are 

Pi - x, p 2 = Ax 

and 

P3 = 1 — x — Ax 

Consider now a sequence of n independent values 

71 , . . . , 7„ . In these n trials event A, will occur v t 
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times, so that v 1 -\-v 2 + V3 = n. If mj + m,2 + rnz = n, and 
all < m, 1 < n, then, according to the polynomial law, 

Pji/! = mi, v 2 = m 2 , f3 = ^3} 



mi'.m2\m3'. 
One can easily verify that 






P 



'\ x < T(s) < a; + Ax > 

= PJ 1/1 = * - 1, i/j = 1, va = n - s \ + Oi (Axf i 
Then the density p(x) of j^ can be computed: 

p(x) = lim I P{x < 7( s ) < x + Aaj}/Aa; ) 

= n ("ll 1 ) ar, ~ 1 ( 1 - X ) n "' 

Various random variables with densities (3. 17) can 
be modeled using various order statistics. Assume 
that positive integers s and t are given and denote 
n = s + 1 - 1. Then the density (3.17) turns into a 
beta-distribution 

V(x) = £ t) > for < x < 1 (3.18) 

where the beta-function 

50, <) = r(*)r(0/r(* + *) 

= (s-l)!(t-l)!/(« + <-l)! 

Therefore, for modeling a beta-distributed random 
variable £, one has to select s + t - 1 random numbers 
Tl , . . . , 7,+i_i, re-order them T(1) < . . . < 7( 5 +*-i). and 
seU = 7 (5) . 

Algorithms for modeling gamma- and beta-distrib- 
uted random variables with non-integer parameters 
can be found in Rubinstein. 8 
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Transformations 

of the Type ( =g(-yi, . . . , 7n, • • •) 

This type of transformation includes transforma- 
tions that depend on a random quantity of random 
numbers, so that the number of 7-s used for com- 
puting of a particular value of £ may be different and 
even unrestricted. Of course the average number of 
random numbers required for computing one value of 
f must be finite. 

The most frequently used transformations of this 
type are the rejection techniques, von Neumann's 
method for modeling continuous random variables 
is one of the earliest examples in which these tech- 
niques were applied. 

The rejection technique utilizes a computation equa- 
tion that has an acceptance rule. For example, 

£ = 0(71. ■••> 7m) if Kfi, ..., J m ) > 

For modeling the variable £, we select m random 
numbers 7^ ..., j m . If the acceptance rule is met, 
that is, if h(ji, ..., 7 m ) > 0, we compute f = g(j ly ..., 
7 m ); otherwise, the set j lt . . ., -y m is rejected, and new 
numbers jx , . . ., j m are selected. 

The acceptance probability 



e = plh( 7l , ..., 7m)>0| 



is called the efficiency of the technique. 

If N random groups (yx, ..., j m ) are considered, 
then, on the average, only eN values of £ are com- 
puted. Thus, eN values of £ require mN random 
numbers, and, on the average, m/e random numbers 
have to be consumed in order to obtain one value of 
£. Clearly, when e — ► this modeling method becomes 
inefficient. 



90 additional information 



generalized von neumann's method 

Assume that the density p(x) of a random variable 
£ can be represented as a product of the form 

p(x) = Pl (x)f(x) 

where p x (x) is the density of an auxiliary random vari- 
able t)' ', and f(x) is bounded: f(x) < C. 

If we know how to model 77' we can construct a 
method for modeling £: 

1 . Find t)' and an independent random number 7; 
compute 77" = Cj. 

2. If 77" < f(r)'), set £ = 77'; otherwise reject 77' and 
7 and repeat (1). 



proof 

The point (77', 77"), in Figure 3.9, will be in the 
strip (x, x + dx), with probability p x {x)dx. Since 77" 
is uniformly distributed over the interval < y < C, 
the conditional probability of acceptance is f(x)/C. 
Accordingly, the probability of 77' falling into (x, x + 
dx) without being rejected is equal to the product 
Pi(x)dx f(x)/C, and is thus proportional to p(x)dx. 

The efficiency of this method can be computed by 
summing the conditional acceptance probabilities: 

e= I -j^-pi(x) dx-— I p(x) dx = — 

a a 

von Neumann's method is a special case of the last 
method, corresponding to the choice of a constant 
density pi(x) = l/(6-a) and f(x) = (b-a)p(x) in a finite 
interval a < x <b. The restriction p(x) < M yields the 
bound f(x) < (6 - a)M . Therefore, the efficiency of 
von Neumann's method is e = ((6 - a)M )~ ■ 

The same result can be obtained directly from Fig- 
ure 1.11: since the random point T(r)' , 77") is uniformly 
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x+dx x 

Fig. 3.9. Random points 
that appear above the 
curve y = f(x) are rejected. 

distributed over the rectangle a < x < b, < y < M , 
the acceptance probability is equal to the ratio of the 
area under the curve y = p(x) to the area of the rect- 
angle: 

b 

e= fp(x)dx/((b-a)Mo) = ((b-a)M o y 1 



Example 

Consider the density p(x) = v(x)x~ 1 ^ 3 defined in 
< x < 1, with a bounded function v(x) < A. 

Let us choose an auxiliary density pi(z) = 
(2/3)a; _1/3 that can be treated by the inverse func- 
tions method. Then the corresponding f(x) is bound- 
ed: 

/(*) = P (x)/ Pl (x) = (3/2)t,(s) < (3/2)A 

The acceptance rule rf' < /(?/) can be divided by 3/2 
and the algorithm for modeling £ with density p(x) 
thus acquires the form: 

(1) Select 7! and 72 ; compute 77' = (71 ) 3 ' 2 

(2) If A12 < v{v')< set £ = T ]'' otherwise reject the 
pair 7 1; 72 and repeat (1). 
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on monte carlo algorithms 

Time Consumption of a Monte Carlo Algorithm 

Suppose that we are interested in a certain quan- 
tity m, and we have defined a random variable £ so 
that its expectation M£ = m, and its variance D£ 
is finite. Then we can select N independent values 
£1, ■ • • , £n of this variable, and form the estimate 



N 

i 

m 



TrEti < 3 - 19 ) 



N 

which is usually called a Monte Carlo method for es- 
timating m. We have seen that the accuracy of the 
estimate (3.19) depends on D£/jV (see The General 
Scheme of the Monte Carlo Method, Chapter 1). How- 
ever relation (3.19) still does not determine the com- 
putation algorithm; one requires an equation for mod- 
eling £ by means of standard random numbers. 
Let us specify the modeling equation 

£ = ff(7i, 72, -..) (3.20) 

Both relations (3.19) and (3.20) completely define the 
Monfe Carlo algorithm for estimating m. 

Let t denote the computer time expended in calcu- 
lating a single value of £ from (3.20). Then the total 
computing time of (3.19) is T = Nt. 

Consider the expression for the probable error of 
estimate (3.19) 

r N = 0.6745 y/^/ N 
and substitute N = T/t. Then 



r N = 0.6745 V*D{/r (3.21) 

The last formula shows that if the total computer time 
T is held fixed, then the probable error of a Monte 
Carlo algorithm depends on the product t ■ D£, which 
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is called the time consumption (or laboriousness) of 
the algorithm. The less laborious the algorithm, the 
smaller the probable error. 

In most problems, both D£ and t are estimated 
numerically from relatively small samples, however, 
sometimes this can be done analytically. 

An example with splitting 

Consider the problem of estimating the expecta- 
tion m = MO of a random variable 6 = /(£, 77). Assume 
that subroutines for modeling £ and 77 are available, 
as well as a subroutine for computing /(£, 77). Assume 
also that the variance DO is finite. 

Clearly, we can model independent values Oj = 
f(£j, rjj) and use the estimate 

N 



hE 9 * < 3 - 22 ) 



""-AT 

3=1 

Let t% and tr, be the computer time required to 
model £ and 77 respectively, and let tf be the computer 
time required for one calculation of /(£, 77). Then the 
time consumption of the algorithm (3.22) is t-DO, with 
t = t ( + t v +t f . 

Now assume that the weak point of our computa- 
tion algorithm is the modeling of £, or, more precisely, 
t% > i v +tf. Then it may be worthwhile to consider 
averaged values 

rather than 9, so that each £ is coupled with s in- 
dependent values of 77. The variance of 0<» can be 
expressed in the form 

D0W = D0(1 - r + rs)/s (3.23) 

where r is a constant < r < 1. The computing time 
of 0(*) is equal to 

* W =t ( + 8(t„ + tf) 
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(The constant r is the correlation coefficient of 
the random variables /(£, 77') and /(£, 77"), with 77' 
and 77" being independent values of 77. For obtaining 
D#( s ) the variance of a sum of random variables must 

s s 

be computed using the formula D E h — E D /fc + 
2E E r(f k , tfy/DfkBf,:) 

l<k<j<s 

We can write down the generalized estimate 

1 N s 

™*A^E£/fe, m) (3.24) 

i=i *=i 

and the time consumption of the corresponding algo- 
rithm is 

f(')D0(*) = D0(1 - r + rs)(t n +t f + t ( /s) 

The last expression has one minimum in the region 
< s < 00, namely, at s = s„, where 



«» = V /(l/r-l)* e /(^+*/) 



Numerical example 

Let < f « 100(f, + */) and r « 0.2. Then s* « 20. The 
time consumption of (3.24) with the optimal s = s* is 
3.5 times less than the time consumption of (3.22) 
(Figure 3.10). 

Turning from to 0(*) is a very special case of a 
general Monte Carlo technique termed splitting of a 
trajectory. In our example the "trajectory" is one path 
from £ to 77. 

Constructive Dimension 
of a Monte Carlo Algorithm 

Consider an arbitrary algorithm of the type (3. 19)- 
(3.20) for computing of an arbitrary quantity m = M£. 
If Equation 3.20 depends on n variables only, that is, 
if 

£ = 9(71, •••> 7n) 
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Fig. 3.10. The optimum value s = s,. 

we say that the constructive dimension of the algo- 
rithm (3.19)-(3.20) is n. For the sake of brevity we 
will write c.d.= n. 

In other words, the c.d. is equal to the number of 
random numbers used for computing one value of £. 
More precisely, the c.d. is the maximum number of 
random numbers necessary for carrying out one trial. 
For example (see Computation of Neutron Transmis- 
sion Through a Plate in Chapter 2), different neu- 
trons undergo different numbers of scatterings; the 
amounts of random numbers used for computing one 
value of t) are different also. Theoretically, the number 
of scatterings is unlimited, therefore, this algorithm 
has c.d. = oo. However, if the number of scatterings 
that must be considered is a priori restricted, the c.d. 
of the algorithm is finite. 

If an intermediate random variable £' (in the algo- 
rithm (3.19)-(3.20)) is modeled by an equation of type 
£' = g'(ji, . . ., j s )< then the c.d. of the algorithm in- 
creases as s increases. Hence, the method of inverse 
functions is the best method if one wants to mini- 
mize the c.d. On the other hand, if f is modeled by 
rejection technique, then the c.d. is infinite. 

Arbitrary Monte Carlo algorithms with c.d.= n can 
be interpreted as numerical integrations in n dimen- 
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sions. Indeed, let r be an n-dimensional random 
point with Cartesian coordinates (7!, ..., j n ). The 
point r is uniformly distributed in the n-dimensional 
unit cube 

{0<x x < 1, ..., 0<ar n < 1} 
since its density is constant inside the cube: 

Pr(xi, ..., x n ) = p 11 (xi) ...p ln (x n ) = l 

In this case relation (3.20) is equivalent to £ = g(T). 

Now consider the quantity m that we are estimat- 
ing m = M#(r). Using a multidimensional analog of 
(1.18), we obtain 



1 1 
= ... g(xi, .. ., x n )dxi ... dx n 



Therefore, the relation (3.19) can be rewritten as 

1 1 

/ ... / g(xi, ..., x n )dxi ... dx n 


*j^I>(r,-) (3-25) 

It is clear from (3.25) that our estimate is an approx- 
imation of an integral over the n-dimensional unit 
cube, derived by averaging the values of the integrand 
at independent random points Ti, . . . , T N uniformly 
distributed in the cube. 

Quasi-Monte Carlo Method 

In 1916 H. Weyl found that infinite sequences of 
non-random points Q 1} Q 2> ...,Qj, ■■■ exist, which 
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have a property similar to (3.25): for an arbitrary 
Riemann-integrable function g(xi, . . ., x„) 



/ ... / g(x lt ..., x n )dxi 



dx n 



1 N 

vE^i) ( 3 - 26 ) 



= lim A7 

j = l 

Such sequences are said to be uniformly distributed 
in the number-theoretical sense. 

Number-theoreticians were initially interested only 
in the asymptotic behavior of these sequences. How- 
ever, sequences intended for numerical computations 
must satisfy not only (3.26), but several additional re- 
quirements also (see Sobol'): 3 The uniformity of dis- 
tribution should be optimal as N — > oo; the uniformity 
of distribution of initial points Q 1( . . . , Q N should be 
observed for fairly small N; and formulas for comput- 
ing these points should be simple. 

The first sequences satisfying to a certain extent 
all of these requirements were the LP r -sequences con- 
structed in 1966. Programs for generating these se- 
quences can be found in Bratley and Fox (FORTRAN- 
77), 9 or in Sobol' et al (FORTRAN-77 and C). 10 

Comparing (3.25) with (3.26), one can conclude 
that if random points r ; in (3.25) are replaced by 
points Qj, then the averages converge. In practice, 
as a substitute for the value £,- = g(Tj), one has to 
use the value £,- = g(Qj), or, in other words, the jth 
trial should be carried out using Cartesian coordi- 
nates (qj 1} . . ., q in ) of the point Qj rather than ran- 
dom numbers j lt ..., j n . 

Points Q lt Q 2 , ... are often called quasi-random. 
When standard statistical tests are applied for veri- 
fying the distribution of these points in the n-dimen- 
sional cube, the results are "too good" for random 
points. (Of course these points lack many other prop- 
erties of random points as well.) 
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Fig. 3.11. Computation errors 6n 
and probable errors rjv (logarith- 
mic scale: x = log 2 N, y = log 2 6n 
or log 2 r N ). 

The main reason in favor of quasi-random points 
is the possible increase in the rate of convergence: 
while the approximation error of (3.25) for large N is 
decreasing as l/\/~N, the convergence rate of (3.26) 
can sometimes even be 1/N. Thus, without changing 
the computation algorithm (3.19)-(3.20), but merely 
by replacing the random numbers with coordinates of 
quasi-random points, we can sometimes considerably 
improve our results. 

According to Sobol' et al, 10 if all the variables xi, .. ., 
x n in g(xi, ..., x„) are equally important, and n is 
large (say, n > 15), then there is no advantage in 
switching to quasi-Monte Carlo methods. However, 
if the dependence of g(x x , ..., s;„) on i; decreases as 
i is increased (in other words, the initial coordinates 
are the leading ones), one can expect a considerable 
benefit from employing quasi-Monte Carlo methods, 
even if n is large (say, several tens). 
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Numerical example 

Consider the random variable 

£ = ^(1 + 271X2 + 272)... (9 + 2 79 ) 

Its expectation is M£ = 1, and its variance D£ = 
0.196. If M£ were estimated by the Monte Carlo meth- 
od (3.19) the probable error would be r N = 0.30-/V -1 / 2 . 
The corresponding quasi-Monte Carlo method 



M ^'" = ^£ 



N 1 + 2^-1 2 + 2q j2 9 + 2q j9 



N ^> 2 3 10 

3=1 

where (qj lt ..., q j9 ) is the jth point of a 9-dimen- 
sional LP T -sequence is realized. The computational 
errors 6n = \In — 1| are shown in Figure 3.11; the 
scale is logarithmic: x = log 2 N, y = log 2 6 N ; the dotted 
line is y = log 2 rjy. 

One can see that 6 N « 2/N. 
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